Autonomous vehicles must balance a complex set of objectives. There is no consensus on how they should do so, nor on a model for specifying desired driving behavior. We created a dataset to help address some of these questions in a limited operating domain. The data consists of 92 traffic scenarios, with multiple ways of traversing each scenario. Multiple annotators expressed their preference between pairs of scenario traversals. We used the data to compare an instance of a rulebook, carefully hand-crafted independently of the dataset, with several interpretable machine learning models, such as Bayesian networks, decision trees, and logistic regression, trained on the dataset. To compare driving behaviors, these models use scores indicating the degree to which different scenario traversals violate each of 14 driving rules. The rules are interpretable and were designed by subject-matter experts. First, we found that these rules were sufficient for the models to achieve high classification accuracy on the dataset. Second, we found that the rulebook provides high interpretability without excessively sacrificing performance. Third, the data pointed to possible improvements in the rulebook and the rules, and to potential new rules. Fourth, we explored the interpretability-versus-performance trade-off by also training non-interpretable models such as a random forest. Finally, we make the dataset publicly available to encourage discussion from the wider community on behavior specification for AVs. It is available at github.com/bassam-motional/Reasonable-Crowd.
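To make the setup concrete, the following is a minimal sketch, not the paper's actual pipeline, of how one of the interpretable models (logistic regression) could be trained on pairwise preference data where each traversal is summarized by its violation scores for 14 driving rules. The variable names and the synthetic data are hypothetical stand-ins for the Reasonable Crowd dataset.

```python
# Sketch: pairwise preference classification from rule-violation scores.
# All data here is synthetic; in the real dataset, violation scores come
# from the 14 expert-designed driving rules, and labels come from annotators.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_pairs, n_rules = 500, 14

# Hypothetical violation scores for the two traversals of each scenario pair.
scores_a = rng.random((n_pairs, n_rules))
scores_b = rng.random((n_pairs, n_rules))

# Synthetic stand-in labels: assume annotators prefer the traversal with the
# lower total violation (label 1 means traversal A is preferred).
labels = (scores_a.sum(axis=1) < scores_b.sum(axis=1)).astype(int)

# Featurize each pair as the difference of its violation-score vectors, so
# the learned weights are interpretable as per-rule importances.
features = scores_a - scores_b

model = LogisticRegression().fit(features, labels)
print("training accuracy:", model.score(features, labels))
print("per-rule weights:", model.coef_.round(2))
```

Using score differences as features is one natural choice for this kind of pairwise comparison: the sign of the weighted difference decides the preferred traversal, and each learned weight can be read as the relative importance of one rule, which mirrors the interpretability the abstract attributes to the rule-based models.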