Event cameras are novel bio-inspired sensors that asynchronously capture pixel-level intensity changes in the form of "events". Due to their sensing mechanism, event cameras exhibit little to no motion blur, offer very high temporal resolution, and require significantly less power and memory than traditional frame-based cameras. These characteristics make them an excellent fit for several real-world applications, such as egocentric action recognition on wearable devices, where fast camera motion and limited power challenge traditional vision sensors. However, the ever-growing field of event-based vision has, to date, overlooked the potential of event cameras in such applications. In this paper, we show that event data is a very valuable modality for egocentric action recognition. To this end, we introduce N-EPIC-Kitchens, the first event-based camera extension of the large-scale EPIC-Kitchens dataset. In this context, we propose two strategies: (i) directly processing event-camera data with traditional video-processing architectures (E$^2$(GO)) and (ii) using event data to distill optical-flow information (E$^2$(GO)MO). On our proposed benchmark, we show that event data achieves performance comparable to that of RGB and optical flow, yet without any additional flow computation at deploy time, and improves on RGB-only information by up to 4%.
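Strategy (i) presupposes converting the asynchronous event stream into a dense, frame-like tensor that a conventional video architecture can ingest. The sketch below is purely illustrative and not the paper's own pipeline: it accumulates (t, x, y, polarity) events into a voxel grid, a common event representation in the literature; the function name, event layout, and binning scheme are all assumptions made for this example.

import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    # Illustrative sketch only: `events` is assumed to be an (N, 4)
    # array of (t, x, y, polarity) rows, sorted by timestamp, with
    # polarity in {-1, +1}. Neither the function name nor this layout
    # comes from the paper. The output is a (num_bins, height, width)
    # tensor that a frame-based video architecture can consume.
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]

    # Normalize timestamps to [0, num_bins - 1] and split each event's
    # polarity between its two nearest temporal bins (bilinear in time).
    t_norm = (num_bins - 1) * (t - t[0]) / max(t[-1] - t[0], 1e-9)
    lo = np.floor(t_norm).astype(int)
    hi = np.minimum(lo + 1, num_bins - 1)
    w_hi = t_norm - lo

    np.add.at(voxel, (lo, y, x), p * (1.0 - w_hi))
    np.add.at(voxel, (hi, y, x), p * w_hi)
    return voxel

Stacking such grids over consecutive temporal windows yields a clip-shaped tensor, so an off-the-shelf video backbone can be applied with minimal changes, which is the general idea behind processing event data with traditional video-processing architectures.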