Reliable perception during fast motion maneuvers or in high dynamic range environments is crucial for robotic systems. Since event cameras are robust to these challenging conditions, they have great potential to increase the reliability of robot vision. However, event-based vision has been held back by the shortage of labeled datasets, a consequence of the novelty of event cameras. To overcome this drawback, we propose a task transfer method that trains models directly with labeled images and unlabeled event data. Compared to previous approaches, (i) our method transfers from single images to events instead of from high frame rate videos, and (ii) it does not rely on paired sensor data. To achieve this, we leverage the generative event model to split event features into content and motion features. This split enables efficient matching between the latent spaces of events and images, which is crucial for successful task transfer. Thus, our approach unlocks the vast amount of existing image datasets for training event-based neural networks. Our task transfer method consistently outperforms approaches targeting Unsupervised Domain Adaptation, improving object detection by 0.26 mAP (a 93% relative increase) and classification accuracy by 2.7%.
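The content/motion feature split can be illustrated with a minimal PyTorch sketch: an event encoder whose latent vector is divided into a content part and a motion part, an image encoder producing content features only, and a simple alignment term that matches the two content latent spaces so a task head trained on labeled images can be applied to unlabeled events. All module names, dimensions, and the moment-matching loss below are illustrative assumptions, not the paper's actual architecture or training objective.

```python
# Hypothetical sketch of a content/motion feature split for image-to-event
# task transfer. Module names, sizes, and the alignment loss are assumptions.
import torch
import torch.nn as nn


class EventEncoder(nn.Module):
    """Encodes an event representation (e.g. a voxel grid) into a latent
    vector that is split into content and motion features."""

    def __init__(self, in_channels=5, content_dim=128, motion_dim=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, content_dim + motion_dim),
        )
        self.content_dim = content_dim

    def forward(self, events):
        z = self.backbone(events)
        # Split into content features (shared with images) and
        # motion features (event-specific).
        return z[:, :self.content_dim], z[:, self.content_dim:]


class ImageEncoder(nn.Module):
    """Encodes a grayscale image into content features only."""

    def __init__(self, content_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, content_dim),
        )

    def forward(self, image):
        return self.backbone(image)


def content_alignment_loss(z_img, z_evt):
    """Placeholder moment-matching loss that pulls the event content
    distribution toward the image content distribution."""
    return (z_img.mean(0) - z_evt.mean(0)).pow(2).mean()


if __name__ == "__main__":
    events = torch.randn(8, 5, 64, 64)   # batch of event voxel grids (unlabeled)
    images = torch.randn(8, 1, 64, 64)   # batch of unpaired grayscale images (labeled)
    z_evt_content, z_evt_motion = EventEncoder()(events)
    z_img_content = ImageEncoder()(images)
    # A task head (classifier or detector) trained on z_img_content with image
    # labels can be reused on z_evt_content once the content spaces are aligned.
    print(content_alignment_loss(z_img_content, z_evt_content).item())
```

In practice the alignment between the two content spaces would typically be enforced with a stronger distribution-matching objective (e.g. an adversarial discriminator) rather than the simple mean-matching term used here for brevity.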