In recent years, event cameras (DVS, Dynamic Vision Sensors) have been used in vision systems as an alternative or supplement to traditional cameras. They are characterised by a high dynamic range, high temporal resolution, low latency, and reliable performance in limited lighting conditions, all of which are particularly important in the context of advanced driver assistance systems (ADAS) and self-driving cars. In this work, we test whether these relatively novel sensors can be applied to the popular task of traffic sign detection. To this end, we analyse different representations of the event data: the event frame, the event frequency, and the exponentially decaying time surface, and we apply video frame reconstruction using a deep neural network called FireNet. We use the deep convolutional neural network YOLOv4 as the detector. For the individual representations, we obtain a detection accuracy in the range of 86.9-88.9% mAP@0.5. Fusing the considered representations yields a detector with a higher accuracy of 89.9% mAP@0.5. In comparison, the detector for the frames reconstructed with FireNet achieves an accuracy of 72.67% mAP@0.5. The results illustrate the potential of event cameras in automotive applications, either as standalone sensors or in close cooperation with typical frame-based cameras.
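To make the three representations named above concrete, the following minimal NumPy sketch shows one plausible way to compute them from a batch of (x, y, t, p) events. The function name, the `tau` default, and the normalisation choices are our assumptions for illustration, not necessarily the exact pipeline used in the paper.

```python
import numpy as np

def events_to_representations(events, height, width, t_ref, tau=0.05):
    """Build three event representations from an (N, 4) array of
    DVS events (x, y, t, p), with t in seconds and t_ref the time
    at which the representation is evaluated. Parameter names, the
    tau default, and the normalisation are illustrative choices."""
    x = events[:, 0].astype(np.intp)
    y = events[:, 1].astype(np.intp)
    t = events[:, 2].astype(np.float64)

    # Event frame: binary image marking pixels with at least one event.
    frame = np.zeros((height, width), dtype=np.uint8)
    frame[y, x] = 255

    # Event frequency: per-pixel event count, normalised to [0, 1].
    freq = np.zeros((height, width), dtype=np.float32)
    np.add.at(freq, (y, x), 1.0)
    if freq.max() > 0:
        freq /= freq.max()

    # Exponentially decaying time surface: each pixel holds
    # exp(-(t_ref - t_last) / tau), where t_last is the timestamp
    # of the most recent event at that pixel.
    last_t = np.full((height, width), -np.inf)
    np.maximum.at(last_t, (y, x), t)
    surface = np.where(np.isfinite(last_t),
                       np.exp(-(t_ref - last_t) / tau),
                       0.0).astype(np.float32)

    return frame, freq, surface
```

A fused input for the detector could then, for example, stack these maps as channels of a single image, though the exact fusion scheme is described in the body of the paper.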