Wearable cameras, smart glasses, and AR/VR headsets are gaining importance in research and commercial use. They carry a variety of sensors, including cameras, depth sensors, microphones, IMUs, and GPS. Advances in machine perception enable precise user localization (SLAM), eye tracking, and hand tracking. Together, these signals make it possible to understand user behavior and unlock new ways of interacting with augmented reality. Egocentric devices may soon automatically recognize a user's actions, surroundings, gestures, and social relationships. Such capabilities have broad applications in assistive technology, education, fitness, entertainment, gaming, eldercare, robotics, and augmented reality, with the potential for substantial positive impact on society.
Until recently, research in this inherently data-hungry field was held back by the scarcity of large datasets. The community has addressed this by releasing numerous large-scale datasets covering various aspects of egocentric perception, including HoloAssist, Aria Digital Twin, Aria Synthetic Environments, Ego4D, Ego-Exo4D, and EPIC-KITCHENS.
The goal of this workshop is to provide a lively discussion forum for researchers working in this challenging and fast-growing area, and to unlock the potential of data-driven research on these datasets to advance the state of the art.
We welcome submissions to the challenges from March to May (see important dates) through the leaderboards linked below. Participants in the challenges are required to submit a technical report on their method; this is a requirement for the competition. Reports should be 2-6 pages including references, should use the CVPR format, and should be submitted through the CMT website.
Please find below details on the HoloAssist challenges:
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Action Recognition | Mahdi Rad, Microsoft, Switzerland | Link |
2 | Mistake Detection | Ishani Chakraborty, Microsoft, US | Link |
3 | Intervention Type Prediction | Taein Kwon, ETH Zurich, Switzerland | Link |
Please find below details on the Aria Digital Twin challenges:
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Few-shot 3D Object detection & tracking | Xiaqing Pan, Meta, US | Link |
2 | 3D Object detection & tracking | Xiaqing Pan, Meta, US | Link |
Please find below details on the Aria Synthetic Environments challenge:
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Scene Reconstruction using structured language | Vasileios Balntas, Meta, UK | Link |
Ego4D is a massive-scale egocentric dataset and benchmark suite collected across 74 worldwide locations in 9 countries, with over 3,670 hours of daily-life activity video. Please find details below on our challenges; a short annotation-loading sketch follows the table.
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Visual Queries 2D | Santhosh Kumar Ramakrishnan, University of Texas, Austin, US | Link |
2 | Visual Queries 3D | Vincent Cartillier, Georgia Tech, US | Link |
3 | Natural Language Queries | Satwik Kottur, Meta, US | Link |
4 | Moment Queries | Chen Zhao & Merey Ramazanova, KAUST, SA | Link |
5 | EgoTracks | Hao Tang & Weiyao Wang, Meta, US | Link |
6 | Goal Step | Yale Song, Meta, US | Link |
7 | EgoSchema | Karttikeya Mangalam, Raiymbek Akshulakov, UC Berkeley, US | Link |
8 | PNR temporal localization | Yifei Huang, University of Tokyo, JP | Link |
9 | Localization and Tracking | Hao Jiang, Meta, US | Link |
10 | Speech Transcription | Leda Sari, Jachym Kolar & Vamsi Krishna Ithapu, Meta Reality Labs, US | Link |
11 | Looking at me | Eric Zhongcong Xu, National University of Singapore, Singapore | Link |
12 | Short-term Anticipation | Francesco Ragusa, University of Catania, IT | Link |
13 | Long-term Anticipation | Tushar Nagarajan, FAIR, US | Link |
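To give a flavor of how the benchmark data is organized, here is a minimal sketch of browsing the Natural Language Queries (NLQ) annotations. It assumes the annotation JSONs have already been downloaded with the official Ego4D CLI (see the Ego4D documentation); the file path and field names below follow the published NLQ schema but are assumptions to verify against your own copy.

```python
# Minimal sketch: browsing Ego4D NLQ annotations.
# Assumes annotations were downloaded with the official Ego4D CLI; the path
# and field names follow the published NLQ schema, but verify them against
# your own download.
import json
from pathlib import Path

ANNOTATIONS = Path("~/ego4d_data/v2/annotations/nlq_train.json").expanduser()

with ANNOTATIONS.open() as f:
    nlq = json.load(f)

# Annotations are grouped per video, then per clip; each clip carries one or
# more language queries with ground-truth temporal windows.
for video in nlq["videos"][:3]:
    for clip in video["clips"]:
        for ann in clip["annotations"]:
            for query in ann["language_queries"]:
                print(
                    video["video_uid"],
                    repr(query.get("query")),
                    query.get("clip_start_sec"),
                    query.get("clip_end_sec"),
                )
```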
Ego-Exo4D is a diverse, large-scale, multimodal, multi-view video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). A purely illustrative sketch of the ego-exo pairing follows the table below.
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Ego-Pose Body | Pablo Arbelaez & Maria Camila Escobar Palomeque, Universidad de los Andes, Colombia | Link |
2 | Ego-Pose Hands | Jianbo Shi & Shan Shu, University of Pennsylvania, US | Link |
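Purely as an illustration of the ego-exo pairing (the takes.json file name and its fields here are hypothetical placeholders, not the documented Ego-Exo4D schema; consult the official documentation for the actual download layout), one might enumerate the camera streams of each take like this:

```python
# Hypothetical sketch: listing ego vs. exo streams per Ego-Exo4D take.
# "takes.json", "take_name", "cameras", and "is_ego" are illustrative
# placeholders, not the documented schema; check the official Ego-Exo4D
# documentation for the real download layout.
import json
from pathlib import Path

ROOT = Path("~/egoexo_data").expanduser()

with (ROOT / "takes.json").open() as f:
    takes = json.load(f)

for take in takes[:5]:
    cams = take.get("cameras", [])
    ego = [c for c in cams if c.get("is_ego")]      # head-mounted stream(s)
    exo = [c for c in cams if not c.get("is_ego")]  # fixed exocentric cameras
    print(f"{take.get('take_name')}: {len(ego)} ego / {len(exo)} exo stream(s)")
```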
Please check the EPIC-KITCHENS website for more information on the EPIC-KITCHENS challenges. Links to individual challenges are also reported below.
Challenge ID | Challenge Name | Challenge Lead | Challenge Link |
---|---|---|---|
1 | Action Recognition | Jacob Chalk, University of Bristol, UK | Link |
2 | Action Anticipation | Antonino Furnari and Francesco Ragusa, University of Catania, IT | Link |
3 | Action Detection | Francesco Ragusa and Antonino Furnari, University of Catania, IT | Link |
4 | Domain Adaptation for Action Recognition | Toby Perrett, University of Bristol, UK | Link |
5 | Multi-Instance Retrieval | Michael Wray, University of Bristol, UK | Link |
6 | Semi-Supervised Video-Object Segmentation | Ahmad Dar Khalil, University of Bristol, UK | Link |
7 | Hand-Object Segmentation | Dandan Shan, University of Michigan, US | Link |
8 | EPIC-SOUNDS Audio-Based Interaction Recognition | Jacob Chalk, University of Bristol, UK | Link |
9 | TREK-150 Object Tracking | Matteo Dunnhofer, University of Udine, IT | Link |
You are invited to submit extended abstracts to the first edition of the joint egocentric vision workshop, which will be held alongside CVPR 2024 in Seattle.
These abstracts represent existing or ongoing work and will not be published as part of any proceedings. We welcome all work within the egocentric domain; it is not necessary to use the Ego4D dataset. We expect a submission to cover one or more of the following topics (this is a non-exhaustive list):
Extended abstracts are 2-4 pages, including figures, tables, and references. We invite submissions of ongoing or already published work, as well as reports on demonstrations and prototypes. The 1st joint egocentric vision workshop gives authors the opportunity to present their work to the egocentric community and to gather discussion and feedback. Accepted work will be presented either as an oral presentation (virtual or in-person) or as a poster. The review will be single-blind, so there is no need to anonymize your work; otherwise, submissions should follow the format of CVPR submissions (information can be found here). Accepted abstracts will not be published as part of a proceedings, so they can be uploaded to arXiv etc., and the links will be provided on the workshop's webpage. Submissions will be managed through the CMT website.
Event | Date |
---|---|
Challenges Leaderboards Open | Mar 2024 |
Challenges Leaderboards Close | 30 May 2024 |
Challenges Technical Reports Deadline (on CMT) | 5 June 2024 (23:59 PT) |
Extended Abstract Deadline | 10 May 2024 (23:59 PT) |
Extended Abstract Notification to Authors | 29 May 2024 |
Extended Abstracts ArXiv Deadline | 12 June 2024 |
Workshop Date | 17 June 2024 |
All dates are local to Seattle time (Pacific Time).
Workshop Location: Room TBD
A tentative programme is shown below.
Time | Event |
---|---|
08:45-09:00 | Welcome and Introductions |
09:00-09:30 | Invited Keynote 1: Takeo Kanade, Carnegie Mellon University, US |
09:30-10:20 | HoloAssist Challenges |
10:20-11:20 | Coffee Break and Poster Session |
11:20-11:50 | Invited Keynote 2: Diane Larlus, Naver Labs Europe and MIAI Grenoble, FR |
11:50-12:40 | EPIC-KITCHENS Challenges |
12:40-13:40 | Lunch Break |
13:40-14:10 | EgoVis 2022/2023 Distinguished Paper Awards |
14:10-14:40 | Invited Keynote 3: Michael C. Frank & Bria Long, Stanford University, US |
14:40-15:30 | Aria Digital Twin & Synthetic Environments Challenges |
15:30-16:00 | Coffee Break |
16:00-16:30 | Invited Keynote 4: Fernando de La Torre, Carnegie Mellon University, US |
16:30-17:40 | Ego4D Challenges |
17:40-18:10 | Invited Keynote 5: Jim Rehg, University of Illinois Urbana-Champaign, US |
18:10-18:15 | Conclusion |
This workshop follows in the footsteps of these previous events:
EPIC-Kitchens and Ego4D Past Workshops:
Human Body, Hands, and Activities from Egocentric and Multi-view Cameras Past Workshops:
Project Aria Past Tutorials: