So-called implicit behavioral cloning with energy-based models has shown promising results in robotic manipulation tasks. We tested whether the method's advantages carry over to controlling the steering of a real self-driving car with an end-to-end driving model. We performed an extensive comparison of the implicit behavioral cloning approach with explicit baseline approaches, all sharing the same neural network backbone architecture. The explicit baseline models were trained with a regression loss (MAE), a classification loss (softmax with cross-entropy over a discretized output), or as mixture density networks (MDN). While models using the energy-based formulation performed comparably to the baselines in terms of safety-driver interventions, they had a higher whiteness measure, indicating higher jerk. To alleviate this, we present two methods that improve the smoothness of steering. We confirmed that energy-based models handle multimodalities slightly better than simple regression, but this did not translate into significantly better driving ability. We argue that the steering-only road-following task has too few multimodalities to benefit from energy-based models. This shows that applying implicit behavioral cloning to real-world tasks can be challenging, and further investigation is needed to bring out the theoretical advantages of energy-based models.
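The abstract does not define the whiteness measure. As a rough illustration only, the sketch below assumes the common formulation in which whiteness is the root mean square of the rate of change of the steering command; the function name `whiteness`, the fixed time step `dt`, and the sample values are hypothetical and not taken from the paper.

```python
import math


def whiteness(steering_angles, dt):
    """RMS of the finite-difference rate of change of a steering signal.

    Assumed definition (not stated in the abstract): larger values mean
    the command changes faster between consecutive frames, i.e. the
    steering is less smooth.
    """
    if len(steering_angles) < 2:
        raise ValueError("need at least two samples")
    if dt <= 0:
        raise ValueError("dt must be positive")
    # Rate of change between consecutive steering commands.
    rates = [(b - a) / dt for a, b in zip(steering_angles, steering_angles[1:])]
    return math.sqrt(sum(r * r for r in rates) / len(rates))


# Hypothetical steering traces sampled every 0.1 s.
smooth = [0.0, 0.1, 0.2, 0.3]
jittery = [0.0, 0.3, 0.0, 0.3]
print(whiteness(smooth, 0.1) < whiteness(jittery, 0.1))
```

Under this assumed definition, a model whose predictions oscillate from frame to frame scores higher than one producing a steady ramp, which is the sense in which a higher whiteness value is read as less smooth steering.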