Traffic accident anticipation aims to accurately and promptly predict the occurrence of a future accident from dashcam videos, which is vital for a safety-guaranteed self-driving system. To encourage an early and accurate decision, existing approaches typically focus on capturing the cues of spatial and temporal context before a future accident occurs. However, their decision-making lacks visual explanation and ignores the dynamic interaction with the environment. In this paper, we propose Deep ReInforced accident anticipation with Visual Explanation, named DRIVE. The method simulates both bottom-up and top-down visual attention mechanisms in a dashcam observation environment, so that the decision from the proposed stochastic multi-task agent can be visually explained by attentive regions. Moreover, the proposed dense anticipation reward and sparse fixation reward are effective in training the DRIVE model with our improved reinforcement learning algorithm. Experimental results show that the DRIVE model achieves state-of-the-art performance on multiple real-world traffic accident datasets. The code and pre-trained model will be available upon paper acceptance.