Rescaling Egocentric Vision

Dima Damen, Hazel Doughty, Giovanni Maria Farinella, Antonino Furnari, Evangelos Kazakos, Jian Ma, Davide Moltisanti, Jonathan Munro, Toby Perrett, Will Price, Michael Wray

2020

This paper introduces EPIC-KITCHENS-100, the largest annotated egocentric dataset (100 hours, 20M frames, 90K actions) of wearable videos capturing long-term unscripted activities in 45 environments. It extends our previous dataset (EPIC-KITCHENS-55), released in 2018, resulting in more action segments (+128%), environments (+41%) and hours (+84%), using a novel annotation pipeline that allows denser and more complete annotations of fine-grained actions (54% more actions per minute). We evaluate the "test of time", i.e. whether models trained on data collected in 2018 can generalise to new footage collected under the same hypotheses albeit "two years on". The dataset is aligned with 6 challenges: action recognition (full and weak supervision), detection, anticipation, retrieval (from captions), as well as unsupervised domain adaptation for action recognition. For each challenge, we define the task and provide baselines and evaluation metrics. Our dataset and challenge leaderboards will be made publicly available.
