In Actor and Observer we introduced the Charades-Ego Dataset, a dataset linking the first- and third-person video understanding domains. In this paper we describe the egocentric aspect of the dataset and present annotations for Charades-Ego comprising 68,536 activity instances in 68.8 hours of first- and third-person video, making it one of the largest and most diverse egocentric datasets available. Charades-Ego furthermore shares activity classes, scripts, and methodology with the Charades dataset, which consists of an additional 82.3 hours of third-person video with 66,500 activity instances. Charades-Ego has temporal annotations and textual descriptions, making it suitable for egocentric video classification, localization, captioning, and new tasks utilizing the cross-modal nature of the data.