Inspired by human visual attention, we introduce a Maximum Entropy Deep Inverse Reinforcement Learning (MEDIRL) framework for modeling the visual attention allocation of drivers facing imminent rear-end collisions. MEDIRL is composed of visual, driving, and attention modules. Given a front-view driving video and the corresponding human eye fixations, the visual and driving modules extract generic and driving-specific visual features, respectively. The attention module then learns the intrinsic task-sensitive reward functions induced by the eye-fixation policies recorded from attentive drivers, and MEDIRL uses the learned policies to predict drivers' visual attention allocation. We also introduce EyeCar, a new dataset of driver visual attention in accident-prone situations. Comprehensive experiments show that MEDIRL outperforms previous state-of-the-art methods on driving-task-related visual attention allocation across three large-scale driving-attention benchmarks: DR(eye)VE, BDD-A, and DADA-2000. The code and dataset are provided for reproducibility.
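For context, a minimal sketch of the standard maximum entropy IRL formulation that deep IRL frameworks of this kind build on; the notation here follows Ziebart et al.'s MaxEnt IRL and Wulfmeier et al.'s deep extension, and is not necessarily the paper's exact definition. A fixation trajectory $\zeta$ is assumed to occur with probability proportional to its exponentiated cumulative reward, and the parameters $\theta$ of a learned reward network $r_\theta$ are trained by maximizing the log-likelihood of the demonstrated trajectories $\mathcal{D}$:
\[
P(\zeta \mid \theta) = \frac{1}{Z(\theta)} \exp\!\Big(\sum_{s \in \zeta} r_\theta(s)\Big),
\qquad
\mathcal{L}(\theta) = \sum_{\zeta \in \mathcal{D}} \log P(\zeta \mid \theta),
\qquad
\nabla_\theta \mathcal{L} = \sum_{s}\big(\mu_{\mathcal{D}}(s) - \mathbb{E}[\mu(s)]\big)\,\nabla_\theta r_\theta(s),
\]
where $Z(\theta)$ is the partition function over trajectories, $\mu_{\mathcal{D}}(s)$ is the empirical state-visitation frequency of the recorded fixations, and $\mathbb{E}[\mu(s)]$ is the expected visitation frequency under the policy induced by the current reward, so the gradient backpropagates the visitation mismatch through the reward network.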