Our research spans computer vision and multimodal learning, with a current emphasis on building reliable AI models and advancing multimodal large language models. We work on topics such as grounding, captioning, visual question answering, and multimodal fact-checking.

For the most up-to-date snapshot of our lab’s research topics, see the lab leads’ Google Scholar profiles (Prof. Anna Rohrbach, Prof. Marcus Rohrbach).

Research Project Areas