Learning User-Interpretable Descriptions of Black-Box AI System Capabilities

Several approaches have been developed to answer specific questions that a user may have about an AI system that can plan and act. However, the problems of identifying which questions to ask and that of computing a user-interpretable symbolic description of the overall capabilities of the system have remained largely unaddressed. This paper presents an approach for addressing these problems by learning user-interpretable symbolic descriptions of the limits and capabilities of a black-box AI system using low-level simulators. It uses a hierarchical active querying paradigm to generate questions and to learn a user-interpretable model of the AI system based on its responses. In contrast to prior work, we consider settings where imprecision of the user's conceptual vocabulary precludes a direct expression of the agent's capabilities. Furthermore, our approach does not require assumptions about the internal design of the target AI system or about the methods that it may use to compute or learn task solutions. Empirical evaluation on several game-based simulator domains shows that this approach can efficiently learn symbolic models of AI systems that use a deterministic black-box policy in fully observable scenarios.

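To make the active-querying idea in the abstract concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes that a capability's precondition is a conjunction over a small Boolean user vocabulary, that the agent is deterministic, and that one state in which the capability succeeds is already known. The predicate names (`has_key`, `at_door`, `raining`) and the function `black_box_open_door` are hypothetical stand-ins for a simulator-backed agent.

```python
from typing import Callable, Dict

State = Dict[str, bool]  # truth value of each predicate in the user's vocabulary


def black_box_open_door(state: State) -> bool:
    """Hypothetical black-box agent: can it open the door from `state`?

    The learner never inspects this body; it only observes the return value.
    """
    return state["has_key"] and state["at_door"]


def learn_precondition(query: Callable[[State], bool], seed: State) -> State:
    """Learn a conjunctive precondition by flipping one predicate at a time.

    Assumption: `seed` is a state in which the capability succeeds.
    A predicate belongs to the precondition if flipping it makes the agent fail.
    """
    if not query(seed):
        raise ValueError("seed state must be one where the capability succeeds")
    precondition: State = {}
    for predicate, value in seed.items():
        probe = dict(seed)
        probe[predicate] = not value  # the generated "question" for the agent
        if not query(probe):
            precondition[predicate] = value
    return precondition


seed_state = {"has_key": True, "at_door": True, "raining": False}
learned = learn_precondition(black_box_open_door, seed_state)
print(learned)  # {'has_key': True, 'at_door': True}
```

Each probe state plays the role of a generated question, and the learner uses one query per predicate plus one to check the seed. The hierarchical querying described in the abstract is more general than this single-flip scheme, which would fail, for example, on disjunctive preconditions.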

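Related content

Black box

In science, computing, and engineering, a black box is a device, system, or object that can be viewed in terms of its inputs and outputs (or transfer characteristics) without any knowledge of its internal workings. Its implementation is "opaque" (black). Almost anything can be referred to as a black box: a transistor, an engine, an algorithm, the human brain, an institution, or a government. To analyze something modeled as an open system with a typical "black-box approach", only the stimulus/response behavior is considered, in order to infer the contents of the (unknown) box. The usual representation of such a black-box system is a data flow diagram centered on the box. The opposite of a black box is a system whose inner components or logic are available for inspection, most commonly called a white box (sometimes also a "clear box" or a "glass box").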
