Advances in simulation and in environment sampling guided by formal methods have enabled rigorous evaluation of machine learning models in a number of safety-critical scenarios, such as autonomous driving. However, these environment sampling techniques have yet to be fully exploited for improving the learned models themselves. In this work, we introduce a novel method for improving imitation-learned models in a semantically aware fashion, using specification-guided sampling to aggregate expert data in new environments. Specifically, we create a set of formal specifications that partitions the space of possible environments into semantically similar regions, and we identify the elements of this partition where the learned model behaves most differently from the expert. We then aggregate expert data on environments in these identified regions, leading to more accurate imitation of the expert's behavior semantics. We instantiate our approach in a series of experiments in the CARLA driving simulator and demonstrate that it yields models that are more accurate than those learned with other environment sampling methods.
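The partition-then-aggregate loop described above can be illustrated with a minimal Python sketch. This is not the implementation used in the experiments. It assumes that environments are plain dictionaries, that each specification is a Boolean predicate over an environment, and that disagreement is measured as the mean absolute difference between learner and expert actions. All names here (`partition_by_specs`, `mean_disagreement`, `select_regions`, `aggregate_expert_data`, and the toy `gap`/`speed` features) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partition_by_specs(envs, specs):
    """Group environments by the tuple of truth values of every spec.

    Environments that satisfy exactly the same specifications land in the
    same region, which stands in for a "semantically similar" region.
    """
    regions = defaultdict(list)
    for env in envs:
        key = tuple(bool(spec(env)) for spec in specs)
        regions[key].append(env)
    return dict(regions)


def mean_disagreement(envs, learner, expert):
    """Assumed metric: mean absolute action difference over a region."""
    return sum(abs(learner(e) - expert(e)) for e in envs) / len(envs)


def select_regions(regions, learner, expert, k=1):
    """Return the keys of the k regions where learner and expert differ most."""
    ranked = sorted(
        regions,
        key=lambda key: mean_disagreement(regions[key], learner, expert),
        reverse=True,
    )
    return ranked[:k]


def aggregate_expert_data(dataset, regions, selected, expert):
    """Append expert-labelled (environment, action) pairs from selected regions."""
    for key in selected:
        for env in regions[key]:
            dataset.append((env, expert(env)))
    return dataset


# Toy stand-ins (assumptions): the action is a brake command in {0.0, 1.0}.
def toy_expert(env):
    return 1.0 if env["gap"] < 10.0 else 0.0  # brakes whenever the gap is small


def toy_learner(env):
    # Imperfect imitation: only brakes when close AND fast.
    return 1.0 if env["gap"] < 10.0 and env["speed"] > 15.0 else 0.0


toy_specs = [
    lambda env: env["gap"] < 10.0,    # hypothetical spec: lead vehicle is close
    lambda env: env["speed"] > 15.0,  # hypothetical spec: ego vehicle is fast
]
toy_envs = [{"gap": g, "speed": s} for g in (5.0, 20.0) for s in (10.0, 20.0)]

toy_regions = partition_by_specs(toy_envs, toy_specs)
worst = select_regions(toy_regions, toy_learner, toy_expert, k=1)
new_data = aggregate_expert_data([], toy_regions, worst, toy_expert)
print(worst, new_data)
```

In this toy setting the learner only fails in the "close but slow" region, so that region is the one selected for additional expert data. In a real pipeline the aggregated data would be used to retrain the imitation model, and the loop could then be repeated.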