Deep neural networks (DNNs) have become remarkably successful at data prediction, and have even been used to predict future actions from limited input. This raises the question: do these systems actually "understand" an event in a way similar to humans? Here, we address this issue using videos of an accident situation in a driving simulation. In this situation, drivers had to choose between crashing into an obstacle that suddenly appeared and steering their car off a previously indicated cliff. We compared how well humans and a DNN predicted this decision as a function of time before the event. The DNN outperformed humans at early time-points, but performed equally well at later time-points. Interestingly, spatio-temporal image manipulations and Grad-CAM visualizations uncovered some expected behavior, but also highlighted potential differences in the DNN's temporal processing.