Today's state-of-the-art visual navigation agents typically consist of large deep learning models trained end to end. Such models offer little to no interpretability about the skills they learn or the actions the agent takes in response to its environment. While past works have explored interpreting deep learning models, little attention has been devoted to interpreting embodied AI systems, which often involve reasoning about the structure of the environment, target characteristics, and the outcome of one's actions. In this paper, we introduce the Interpretability System for Embodied agEnts (iSEE) for Point Goal and Object Goal navigation agents. We use iSEE to probe the dynamic representations produced by these agents for the presence of information about the agent as well as the environment. Using iSEE, we demonstrate interesting insights about navigation agents, including their ability to encode reachable locations (to avoid obstacles), visibility of the target, and progress from the initial spawn location, as well as the dramatic effect on agent behavior when we mask out critical individual neurons. The code is available at: https://github.com/allenai/iSEE