Current autonomous driving systems consist of a perception system and a decision system, each divided into multiple subsystems built on extensive human heuristics. An end-to-end approach could simplify the overall system and avoid massive human engineering effort, while achieving better performance as data and computational resources grow. Compared to the decision system, the perception system is better suited to an end-to-end framework, since it does not require online driving exploration. In this paper, we propose a novel end-to-end approach for autonomous driving perception. A latent space is introduced to capture all features relevant to perception, learned through sequential latent representation learning. The learned end-to-end perception model solves the detection, tracking, localization, and mapping problems jointly, with minimal human engineering effort and without storing any maps online. The proposed method is evaluated in a realistic urban driving simulator, with both camera images and lidar point clouds as sensor inputs. The code and videos for this work are available at our GitHub repo and project website.
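To make the sequential latent representation idea concrete, the sketch below shows the generic recurrent-update structure such models share: a fixed-size latent state is carried through time and refined at each step from the previous latent, the previous action, and the current observation. This is a minimal toy illustration, not the paper's actual architecture; all dimensions, weights, and the `latent_step` fusion rule are hypothetical stand-ins for learned networks.

```python
import numpy as np

# Hypothetical dimensions; real models use high-dimensional fused
# camera/lidar features, not raw vectors like these.
LATENT_DIM, ACTION_DIM, OBS_DIM = 8, 2, 16

rng = np.random.default_rng(0)

# Toy linear maps standing in for learned dynamics and observation encoders.
W_dyn = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM + ACTION_DIM))
W_obs = rng.normal(scale=0.1, size=(LATENT_DIM, OBS_DIM))

def latent_step(z_prev, a_prev, x_t):
    """One posterior update: a prior predicted from the latent dynamics,
    corrected by features extracted from the current observation."""
    prior = np.tanh(W_dyn @ np.concatenate([z_prev, a_prev]))
    obs_feat = np.tanh(W_obs @ x_t)
    return 0.5 * (prior + obs_feat)  # simple fusion in place of a learned posterior

# Roll the latent state through a short sequence of observations/actions.
z = np.zeros(LATENT_DIM)
for t in range(5):
    a = rng.normal(size=ACTION_DIM)  # e.g. steering/throttle at step t-1
    x = rng.normal(size=OBS_DIM)     # e.g. fused camera/lidar features at step t
    z = latent_step(z, a, x)

print(z.shape)
```

The key property this sketch captures is that the latent state is a fixed-size summary of the whole sensor history, which is why downstream tasks such as detection, tracking, localization, and mapping can all read from it without storing explicit maps.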