NDLM2023

Overview

How can we train ever-larger vision deep neural networks, i.e., learn representations for images, videos, and 3D textured shapes, when labeled data is insufficient or noisy? This is a persistent and increasingly urgent challenge amid the rapid development of large-scale vision models. In this tutorial, we will explore a collection of recent research works that address this challenge. These works cover a range of topics, including self-, weakly-, and text-supervised learning, zero-shot learning, test-time adaptation, and data-efficient training approaches. By discussing these approaches, we aim to provide a comprehensive overview of the latest techniques for training large-scale vision models with limited labeled data.

In detail, our talks and discussions will cover the following topics:

self-supervised representation learning for images and videos in the 2D domain
zero- and one-shot learning
open-vocabulary representation learning
test-time and data-efficient training approaches
data-free learning

Challenges from specific application domains, such as autonomous vehicles, will also be discussed in depth.

Tutorial Schedule

Please attend our tutorial via our ICCV tutorial website. Slides and videos will be released right after the tutorial.

Start Time (France)	End Time (France)	Event
8:20:00 AM	8:30:00 AM	Intro and Opening Remarks
8:30:00 AM	9:10:00 AM	Invited Talk (Shalini De Mello): Open-Vocabulary Recognition with Large Image-Text Foundational Models
9:10:00 AM	9:50:00 AM	Invited Talk (Xin Wang): Data-Efficient Learning via Top-Down Attention Steering
9:50:00 AM	10:00:00 AM	Break
10:00:00 AM	10:40:00 AM	Invited Talk (Xueting Li): Zero-/One-shot learning for deformable 3D avatars
10:40:00 AM	11:20:00 AM	Invited Talk (Varun Jampani): 3D of Everything: Automatic 3D Object Understanding from Internet Image Collections
11:20:00 AM	12:00:00 PM	Invited Talk (Xinlei Chen): Self-supervised Learning: Two Known Paradigms and a Less-known Observation
12:00:00 PM	1:30:00 PM	Lunch Break
1:30:00 PM	1:40:00 PM	Opening Remarks on Data-efficient Learning
1:40:00 PM	2:20:00 PM	Invited Talk (Ishan Misra): Using unlabeled data to scale representations across modalities
2:20:00 PM	3:00:00 PM	Invited Talk (Danny Hongxu Yin): Towards Efficient and Reliable Deep Learning
3:00:00 PM	3:10:00 PM	Break
3:10:00 PM	3:50:00 PM	Invited Talk (Bichen Wu): Data-efficient language-supervised zero-shot visual learning with self-distillation
3:50:00 PM	4:20:00 PM	Invited Talk (Rafid Mahmood): Optimizing Data Collection for Machine Learning
4:20:00 PM	4:30:00 PM	Closing Remarks