The advances in artificial intelligence enabled by deep learning architectures are undeniable. In several cases, deep neural network-driven models have surpassed human-level performance on benchmark autonomy tasks. The underlying policies for these agents, however, are not easily interpretable. In fact, given their underlying deep models, it is impossible to directly understand the mapping from observations to actions for any reasonably complex agent. Producing the supporting technology to "open the black box" of these AI systems, without sacrificing performance, was the fundamental goal of the DARPA XAI program. From our journey through this program, we have several "big picture" takeaways: 1) explanations need to be highly tailored to their scenario; 2) many seemingly high-performing RL agents are extremely brittle and are not amenable to explanation; 3) causal models allow for rich explanations, but how to present them is not always straightforward; and 4) human subjects conjure fantastically wrong mental models for AIs, and these models are often hard to break. This paper discusses the origins of these takeaways, provides amplifying information, and offers suggestions for future work.