Imagine experiencing a crash as the passenger of an autonomous vehicle. Wouldn't you want to know why it happened? Current end-to-end optimizable deep neural networks (DNNs) for 3D detection, multi-object tracking, and motion forecasting provide little to no explanation of how they make their decisions. To help bridge this gap, we design an end-to-end optimizable multi-object tracking architecture and training protocol inspired by the recently proposed method of interchange intervention training (IIT). By enumerating different tracking decisions and their associated reasoning procedures, we can train individual networks via IIT to reason about the possible decisions. Each network's decisions can be explained by the high-level structural causal model (SCM) it is trained in alignment with. Moreover, our proposed model learns to rank these outcomes, leveraging the promise of deep learning in end-to-end training while remaining inherently interpretable.
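To make the core idea of interchange intervention training concrete, the following is a minimal, hypothetical sketch (all names and the toy models are illustrative, not the paper's actual architecture). One hidden component of a toy "network" is aligned with one variable of a high-level SCM; an interchange intervention patches that component's activation from a source run into a base run, and IIT would penalize any disagreement between the patched network output and the SCM's counterfactual prediction.

```python
def scm_output(a, b):
    """High-level SCM: intermediate variable V = a AND b, output = NOT V."""
    v = a and b
    return int(not v)

def network_forward(x, intervened_hidden=None):
    """Toy low-level model whose single hidden unit is aligned with the
    SCM's variable V. If `intervened_hidden` is given, the hidden
    activation is overwritten with a value taken from another run."""
    a, b = x
    hidden = a * b                      # plays the role of V
    if intervened_hidden is not None:
        hidden = intervened_hidden      # the interchange intervention
    return 1 - hidden                   # plays the role of NOT V

def interchange_intervention(base, source):
    """Capture the aligned activation on `source`, patch it into the run
    on `base`, and return (network output, SCM counterfactual output)."""
    src_hidden = source[0] * source[1]  # activation from the source run
    net_out = network_forward(base, intervened_hidden=src_hidden)
    # SCM counterfactual: set V to its value under the source input
    scm_out = int(not (source[0] and source[1]))
    return net_out, scm_out

# An IIT loss would penalize disagreement between the two outputs; this
# toy network is perfectly aligned with the SCM, so they always match.
for base in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    for source in [(0, 0), (1, 1)]:
        net_out, scm_out = interchange_intervention(base, source)
        assert net_out == scm_out
```

In the actual method, the low-level component is a learned neural activation and the match to the SCM counterfactual is enforced as a training loss rather than holding by construction as it does in this toy example.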