We argue that the attempt to build morality into machines is subject to what we call the Interpretation Problem: any rule we give the machine is open to infinite interpretation in ways of which we might morally disapprove. This problem in Artificial Intelligence illustrates Wittgenstein's general claim that no rule can contain the criteria for its own application. Using games as an example, we attempt to define the structure of normative spaces and argue that any rule-following within a normative space is guided by values external to that space, values which cannot themselves be represented as rules. In light of this problem, we analyse the types of mistakes an artificial moral agent could make and suggest how morality might nevertheless be built into machines: by getting them to interpret the rules we give them in accordance with these external values, through explicit moral reasoning and the presence of structured values, by adjusting the causal power assigned to the agent, and through interaction with human agents, so that the machine develops a virtuous character and the impact of the Interpretation Problem is minimised.