3D hand tracking methods based on monocular RGB videos are easily degraded by motion blur, whereas the event camera, a sensor with high temporal resolution, high dynamic range, sparse output, and low power consumption, is naturally suited to this task. However, obtaining 3D annotations of fast-moving hands is difficult, which hinders the construction of event-based hand tracking datasets. In this paper, we present an event-based speed adaptive hand tracker (ESAHT) to solve the hand tracking problem with an event camera. We train a CNN model on a slow-motion hand tracking dataset, enabling it to leverage the knowledge of RGB-based hand tracking solutions, and then adapt it to fast hand tracking tasks. To realize our solution, we construct the first 3D hand tracking dataset captured by an event camera in a real-world environment, devise two data augmentation methods to narrow the domain gap between slow- and fast-motion data, develop a speed adaptive event stream segmentation method to handle hand movements at different speeds, and introduce a new event-to-frame representation method that adapts to event streams of different lengths. Experiments show that our solution outperforms both RGB-based and prior event-based solutions on fast hand tracking tasks, and our code and dataset will be publicly available.
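To make the event-to-frame idea concrete, the sketch below accumulates a variable-length slice of events, each an (x, y, t, p) tuple, into a two-channel count image, one channel per polarity, with per-channel normalization so slices of different lengths yield comparable magnitudes. This is a generic illustration of event-to-frame conversion under our own assumptions, not the specific representation proposed in the paper; the function name and normalization scheme are hypothetical.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate events (x, y, t, p) into a 2-channel count frame.

    Channel 0 holds negative-polarity counts, channel 1 positive.
    Illustrative only: the paper's actual representation differs.
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    for x, y, t, p in events:
        frame[int(p), int(y), int(x)] += 1.0
    # Normalize each polarity channel to [0, 1] so event slices of
    # different lengths produce frames of comparable magnitude.
    maxima = frame.reshape(2, -1).max(axis=1)
    for c in range(2):
        if maxima[c] > 0:
            frame[c] /= maxima[c]
    return frame

# Hypothetical usage: four events on a 4x4 sensor.
evts = [(0, 0, 0.0, 1), (0, 0, 0.1, 1), (1, 2, 0.2, 0), (3, 3, 0.3, 1)]
f = events_to_frame(evts, 4, 4)
```

A count-based (rather than fixed-duration) slicing of the stream before this step is one simple way to keep the per-frame event count stable as hand speed varies.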