ML-based motion planning is a promising approach for producing agents that exhibit complex behaviors and automatically adapt to novel environments. In autonomous driving, it is common to treat all available training data equally. However, this approach produces agents that do not perform robustly in safety-critical settings, and the issue cannot be addressed simply by adding more data to the training set: we show that an agent trained on only a 10% subset of the data performs just as well as an agent trained on the entire dataset. We present a method to predict the inherent difficulty of a driving situation from data collected by a fleet of autonomous vehicles deployed on public roads. We then demonstrate that this difficulty score can be used in a zero-shot transfer to generate curricula for an imitation-learning-based planning agent. Compared to training on the entire unbiased training dataset, prioritizing difficult driving scenarios reduces collisions by 15% and increases route adherence by 14% in closed-loop evaluation, while using only 10% of the training data.
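The abstract does not say how the difficulty score is turned into a curriculum, so the following is only an illustrative sketch of two simple ways a per-scenario score could prioritize a 10% training subset. All names (`select_hardest`, `sample_by_difficulty`, `scores`) are hypothetical, and the weighted variant uses Efraimidis-Spirakis sampling as a stand-in technique, not a procedure confirmed by the source.

```python
import random
from typing import List, Sequence


def subset_size(n: int, fraction: float) -> int:
    """Number of scenarios to keep; always at least one."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return max(1, round(n * fraction))


def select_hardest(scores: Sequence[float], fraction: float = 0.1) -> List[int]:
    """Return indices of the highest-scoring scenarios, hardest first.

    Assumption: a higher score means a more difficult scenario.
    """
    k = subset_size(len(scores), fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def sample_by_difficulty(
    scores: Sequence[float], fraction: float = 0.1, seed: int = 0
) -> List[int]:
    """Sample indices without replacement, favoring difficult scenarios.

    Uses Efraimidis-Spirakis weighted sampling: each item gets the key
    u ** (1 / weight) with u uniform in [0, 1), and the k largest keys win.
    Scenarios with a non-positive score are never selected.
    """
    rng = random.Random(seed)
    k = subset_size(len(scores), fraction)
    keyed = [
        (rng.random() ** (1.0 / s), i) for i, s in enumerate(scores) if s > 0
    ]
    keyed.sort(reverse=True)
    return [i for _, i in keyed[:k]]


if __name__ == "__main__":
    # Hypothetical difficulty scores for 20 logged driving scenarios.
    scores = [i / 20 for i in range(20)]
    print("hardest 10%:", select_hardest(scores))
    print("weighted 10%:", sample_by_difficulty(scores, seed=42))
```

Deterministic top-k selection gives the most aggressive prioritization, while the weighted variant keeps some easier scenarios in the subset; which of these, if either, matches the reported results is not stated in the abstract.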