In this work, we tackle two vital tasks in automated driving systems: driver intention prediction and risk object identification from egocentric images. In particular, we investigate the question: what are good road scene-level representations for these two tasks? We contend that a scene-level representation must capture the higher-level semantic and geometric structure of the traffic scene around the ego-vehicle as it performs actions on the way to its destination. To this end, we introduce the representation of semantic regions, which are areas that ego-vehicles visit while taking an afforded action (e.g., a left turn at a 4-way intersection). We propose to learn scene-level representations via a novel semantic region prediction task and an automatic semantic region labeling algorithm. We conduct extensive evaluations on the HDD and nuScenes datasets, and the learned representations lead to state-of-the-art performance for driver intention prediction and risk object identification.