Event-based cameras are attracting growing interest within the computer vision community. These sensors operate with asynchronous pixels, emitting events, or "spikes", whenever the luminance change at a given pixel since its last event surpasses a certain threshold. Thanks to their inherent qualities, such as low power consumption, low latency, and high dynamic range, they seem particularly well suited to applications with challenging temporal constraints and safety requirements. Event-based sensors are an excellent fit for Spiking Neural Networks (SNNs), since coupling an asynchronous sensor with neuromorphic hardware can yield real-time systems with minimal power requirements. In this work, we seek to develop one such system, using both event sensor data from the DSEC dataset and spiking neural networks to estimate optical flow in driving scenarios. We propose a U-Net-like SNN which, after supervised training, is able to make dense optical flow estimations. To do so, we encourage both a minimal norm for the error vector and a minimal angle between ground-truth and predicted flow, training our model with back-propagation using a surrogate gradient. In addition, the use of 3D convolutions allows us to capture the dynamic nature of the data by increasing the temporal receptive fields. Upsampling after each decoding stage ensures that each decoder's output contributes to the final estimation. Thanks to separable convolutions, we have been able to develop a light model (compared to competitors) that nonetheless yields reasonably accurate optical flow estimates.
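The training objective described above penalizes both the magnitude of the error vector and the angular deviation between predicted and ground-truth flow. The abstract does not give the exact formulation, so the following is a minimal sketch of one plausible combination (endpoint-error norm plus a cosine-based angular term); the function name, weighting, and epsilon are illustrative assumptions, not the authors' definition.

```python
import numpy as np

def flow_loss(pred, gt, eps=1e-8):
    """Hypothetical combined flow loss: mean endpoint error + angular term.

    pred, gt: arrays of shape (H, W, 2) holding per-pixel flow vectors.
    """
    # Norm of the error vector (average endpoint error).
    epe = np.linalg.norm(pred - gt, axis=-1).mean()

    # Angular term: 1 - cosine similarity between predicted and GT vectors.
    dot = (pred * gt).sum(axis=-1)
    norms = np.linalg.norm(pred, axis=-1) * np.linalg.norm(gt, axis=-1)
    ang = (1.0 - dot / (norms + eps)).mean()

    return epe + ang
```

With a perfect prediction both terms vanish; a prediction pointing in the right direction but with the wrong magnitude is penalized only by the norm term, which is the intent of combining the two.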
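The abstract mentions training the SNN with back-propagation through a surrogate gradient: the spiking nonlinearity (a Heaviside step) has zero derivative almost everywhere, so a smooth stand-in is used on the backward pass. The sketch below illustrates the idea with a fast-sigmoid surrogate; the function names and the slope parameter `k` are assumptions for illustration, not the paper's specific choice.

```python
import numpy as np

def spike(v, thresh=1.0):
    """Forward pass: Heaviside step, emitting a spike when v crosses thresh."""
    return (v >= thresh).astype(float)

def surrogate_grad(v, thresh=1.0, k=10.0):
    """Backward pass: fast-sigmoid surrogate for the Heaviside derivative.

    Smooth, peaked at the threshold, decaying away from it, so gradients
    can flow through the non-differentiable spiking nonlinearity.
    """
    return 1.0 / (1.0 + k * np.abs(v - thresh)) ** 2
```

In an autograd framework this pair would typically be packaged as a custom function whose forward uses `spike` and whose backward multiplies the incoming gradient by `surrogate_grad`.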