We argue that an explainable artificial intelligence must possess a rationale for its decisions, be able to infer the purpose of observed behaviour, and be able to explain its decisions in the context of what its audience understands and intends. To address these issues we present four novel contributions. Firstly, we define an arbitrary task in terms of perceptual states, and discuss two extremes of a domain of possible solutions. Secondly, we define the intensional solution. Optimal by some definitions of intelligence, it describes the purpose of a task. An agent possessed of it has a rationale for its decisions in terms of that purpose, expressed in a perceptual symbol system grounded in hardware. Thirdly, to communicate that rationale requires natural language, a means of encoding and decoding perceptual states. We propose a theory of meaning in which, to acquire language, an agent should model the world a language describes rather than the language itself. If the utterances of humans are of predictive value to the agent's goals, then the agent will imbue those utterances with meaning in terms of its own goals and perceptual states. In the context of Peircean semiotics, a community of agents must share rough approximations of signs, referents and interpretants in order to communicate. Meaning exists only in the context of intent, so to communicate with humans an agent must have comparable experiences and goals. An agent that learns intensional solutions, compelled by objective functions somewhat analogous to human motivators such as hunger and pain, may be capable of explaining its rationale not just in terms of its own intent, but in terms of what its audience understands and intends. It forms some approximation of the perceptual states of humans.