
Audio-Visual Learning for Social Telepresence

Alexander Richard (Reality Labs Research, Meta)

When: 19.10.2022, 16:00–17:00
Where: Room 0.016

These days, physical distance between people is one of the biggest obstacles to maintaining meaningful social relationships with family, friends, and co-workers. Even with today's technology, remote communication is limited to a two-dimensional audio-visual experience and lacks a shared, three-dimensional space in which people can interact across the distance. Our mission at Reality Labs Research (RLR) in Pittsburgh is to develop a telepresence system that is indistinguishable from reality, i.e., a system that provides photo- and phono-realistic social interactions in VR. Building such a system requires modeling complex interactions between visual and acoustic signals: the facial expression of an avatar is strongly influenced by the content and tone of its speech and, vice versa, the tone and content of speech are strongly correlated with facial expression. We demonstrate that these audio-visual relationships can be modeled through a codify-and-resynthesize paradigm for both acoustic and visual outputs, unlocking state-of-the-art systems for face animation and speech enhancement. In the future, these technologies will help build a realistic virtual environment with lifelike avatars that allow for authentic social interactions, connecting people all over the world, anywhere and at any time.
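To give a rough intuition for the codify-and-resynthesize idea mentioned in the abstract, the sketch below shows a toy nearest-neighbor quantizer: continuous feature frames are mapped to discrete codes from a codebook, and the signal is then resynthesized from those codes. This is purely illustrative; the `codify`, `resynthesize`, `codebook`, and `frames` names are hypothetical, and the actual system described in the talk uses learned neural codecs rather than a fixed codebook.

```python
# Toy codify-and-resynthesize sketch (illustrative assumption, not the
# speaker's actual method): quantize feature frames to discrete codes,
# then reconstruct frames from those codes.

def codify(frames, codebook):
    """Map each feature frame to the index of its nearest codebook entry."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda i: dist(f, codebook[i]))
            for f in frames]

def resynthesize(codes, codebook):
    """Reconstruct (approximate) frames from the discrete codes."""
    return [codebook[i] for i in codes]

# Hypothetical codebook entries and input frames (stand-ins for joint
# audio-visual features).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
frames = [(0.1, -0.1), (0.9, 1.2), (0.05, 0.95)]

codes = codify(frames, codebook)       # discrete representation
recon = resynthesize(codes, codebook)  # reconstructed frames
```

In a real system, both the codebook and the encoder/decoder would be learned end to end, so the discrete codes capture the correlated structure of speech and facial motion rather than raw feature distances.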