We cordially invite you to the colloquium of Prof. Dr. Cees G. M. Snoek!
On March 18, 2024, from 1:30 to 2:30 p.m., Prof. Dr. Cees G. M. Snoek, Head of the Video & Image Sense Lab at the University of Amsterdam (Netherlands), will give a research talk (in English).
The research talk is dedicated to the topic “What multimodal foundation models cannot perceive”.
Abstract: Multimodal foundation models are a revolutionary class of AI models that provide impressive abilities to generate multimedia content and do so through interactive prompts in a seemingly creative manner. These foundation models are often self-supervised transformer-based models pre-trained on large volumes of data, typically collected from the web. They already form the basis of all state-of-the-art systems in computer vision and natural language processing across a wide range of tasks and have shown impressive transfer learning abilities. Despite their immense potential, these foundation models face challenges in fundamental perception tasks such as spatial grounding and temporal reasoning, have difficulty operating in low-resource scenarios, and neglect human alignment for ethical, legal, and societal acceptance. In this talk, I will highlight recent work from my lab that identifies several of these challenges as well as ways to update foundation models to address them in a sustainable way, without the need to retrain from scratch.
The event is free of charge. Interested individuals are warmly invited to attend!