With the Autonomous Vehicle (AV) industry shifting towards machine-learned approaches for motion planning, the performance of self-driving systems is starting to rely heavily on large quantities of expert driving demonstrations. However, collecting this demonstration data typically involves expensive HD sensor suites (LiDAR + RADAR + cameras), which quickly becomes financially infeasible at the scales required. This motivates the use of commodity sensors such as cameras for data collection: they are an order of magnitude cheaper than HD sensor suites, but offer lower fidelity. Leveraging these sensors to train an AV motion planner opens a financially viable path to observing the `long tail' of driving events. As our main contribution, we show it is possible to train a high-performance motion planner on commodity vision data that outperforms planners trained on HD-sensor data, at a fraction of the cost. To the best of our knowledge, we are the first to demonstrate this using real-world data. We compare the performance of the autonomy system under these two sensor configurations and show that the lower sensor fidelity can be compensated for by increased quantity: a planner trained on 100h of commodity vision data outperforms one trained on 25h of expensive HD data. We also share the engineering challenges we had to tackle to make this work.