How can we train increasingly large deep neural networks for vision, i.e., learn representations for images, videos, and 3D textured shapes, when labeled data is insufficient or noisy? This is a persistent and increasingly urgent challenge amid the rapid development of large-scale vision models. In this tutorial, we will explore a collection of recent research works that address this challenge. These works cover a range of topics, including self-, weakly-, and text-supervised learning, zero-shot learning, test-time adaptation, and data-efficient training approaches. By discussing these approaches, we aim to provide a comprehensive overview of the latest techniques for training large-scale vision models with limited labeled data.
In detail, our talks and discussions will cover the following topics:

Start Time (France) | End Time (France) | Event |
---|---|---|
8:20:00 AM | 8:30:00 AM | Intro and Opening Remarks |
8:30:00 AM | 9:10:00 AM | Invited Talk (Shalini De Mello): Open-Vocabulary Recognition with Large Image-Text Foundational Models |
9:10:00 AM | 9:50:00 AM | Invited Talk (Xin Wang): Data-Efficient Learning via Top-Down Attention Steering |
9:50:00 AM | 10:00:00 AM | Break |
10:00:00 AM | 10:40:00 AM | Invited Talk (Xueting Li): Zero-/One-shot learning for deformable 3D avatars |
10:40:00 AM | 11:20:00 AM | Invited Talk (Varun Jampani): 3D of Everything: Automatic 3D Object Understanding from Internet Image Collections |
11:20:00 AM | 12:00:00 PM | Invited Talk (Xinlei Chen): Self-supervised Learning: Two Known Paradigms and a Less-known Observation |
12:00:00 PM | 1:30:00 PM | Lunch Break |
1:30:00 PM | 1:40:00 PM | Opening Remarks on Data-efficient Learning |
1:40:00 PM | 2:20:00 PM | Invited Talk (Ishan Misra): Using unlabeled data to scale representations across modalities |
2:20:00 PM | 3:00:00 PM | Invited Talk (Danny Hongxu Yin): Towards Efficient and Reliable Deep Learning |
3:00:00 PM | 3:10:00 PM | Break |
3:10:00 PM | 3:50:00 PM | Invited Talk (Bichen Wu): Data-efficient language-supervised zero-shot visual learning with self-distillation |
3:50:00 PM | 4:20:00 PM | Invited Talk (Rafid Mahmood): Optimizing Data Collection for Machine Learning |
4:20:00 PM | 4:30:00 PM | Closing Remarks |