Event cameras hold great potential for computer vision and robotics applications because of their high temporal resolution and low power consumption. However, the event stream they output is asynchronous and sparse, which existing computer vision algorithms cannot handle. Spiking neural networks (SNNs) are a novel event-based computational paradigm considered well suited to event camera tasks. However, directly training deep SNNs suffers from degradation problems. This work addresses these problems by proposing an SNN architecture that combines a novel residual block design with multi-dimension attention modules, focusing on the problem of depth prediction. In addition, a novel event stream representation method designed specifically for SNNs is proposed. The model outperforms previous artificial neural networks (ANNs) of the same size on the MVSEC dataset and shows great computational efficiency.
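To make the data flow concrete, the following is a minimal sketch of how an asynchronous, sparse event stream can be turned into a sequence of dense frames and fed to a spiking neuron. It is not the representation or neuron model proposed in this work, which the abstract does not specify. It assumes a generic count-based binning and a simple leaky integrate-and-fire (LIF) neuron with hard reset. The names `events_to_frames` and `lif_neuron`, the `(timestamp, x, y, polarity)` event layout, and the parameters `tau` and `v_threshold` are all hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# Assumed event layout: (timestamp, x, y, polarity), with polarity in {0, 1}.
Event = Tuple[float, int, int, int]


def events_to_frames(events: List[Event], t_start: float, t_end: float,
                     num_bins: int, height: int, width: int):
    """Accumulate events into num_bins dense count frames.

    Returns a nested list indexed as frames[bin][polarity][y][x].
    """
    frames = [[[[0] * width for _ in range(height)] for _ in range(2)]
              for _ in range(num_bins)]
    duration = t_end - t_start
    for t, x, y, p in events:
        if not (t_start <= t < t_end):
            continue  # skip events outside the time window
        # Map the timestamp to a bin index, clamped to the last bin.
        b = min(int((t - t_start) / duration * num_bins), num_bins - 1)
        frames[b][p][y][x] += 1
    return frames


def lif_neuron(inputs: List[float], tau: float = 2.0,
               v_threshold: float = 1.0) -> List[int]:
    """Simulate one LIF neuron over a sequence of input currents."""
    v = 0.0
    spikes = []
    for current in inputs:
        v = v + (current - v) / tau  # leaky integration toward the input
        if v >= v_threshold:
            spikes.append(1)
            v = 0.0  # hard reset after a spike
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    events = [(0.00, 1, 0, 1), (0.01, 1, 0, 1), (0.06, 0, 1, 0), (0.09, 1, 0, 1)]
    frames = events_to_frames(events, 0.0, 0.1, num_bins=2, height=2, width=2)
    # Per-bin event counts at pixel (x=1, y=0), positive polarity.
    pixel_counts = [frames[b][1][0][1] for b in range(2)]
    print(pixel_counts)
    print(lif_neuron([1.5, 1.5, 1.5, 1.5]))
```

The binning step trades temporal resolution for a dense tensor that a layered network can consume one time step at a time, while the LIF neuron emits binary spikes only when its membrane potential crosses the threshold, which is the source of the sparsity that SNNs exploit.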