Delphi: Towards Machine Ethics and Norms

What would it take to teach a machine to behave ethically? While broad ethical rules may seem straightforward to state ("thou shalt not kill"), applying such rules to real-world situations is far more complex. For example, while "helping a friend" is generally a good thing to do, "helping a friend spread fake news" is not. We identify four underlying challenges towards machine ethics and norms: (1) an understanding of moral precepts and social norms; (2) the ability to perceive real-world situations visually or by reading natural language descriptions; (3) commonsense reasoning to anticipate the outcome of alternative actions in different contexts; (4) most importantly, the ability to make ethical judgments given the interplay between competing values and their grounding in different contexts (e.g., the right to freedom of expression vs. preventing the spread of fake news). Our paper begins to address these questions within the deep learning paradigm. Our prototype model, Delphi, demonstrates strong promise of language-based commonsense moral reasoning, with up to 92.1% accuracy vetted by humans. This is in stark contrast to the zero-shot performance of GPT-3 of 52.3%, which suggests that massive scale alone does not endow pre-trained neural language models with human values. Thus, we present Commonsense Norm Bank, a moral textbook customized for machines, which compiles 1.7M examples of people's ethical judgments on a broad spectrum of everyday situations. In addition to the new resources and baseline performances for future research, our study provides new insights that lead to several important open research questions: differentiating between universal human values and personal values, modeling different moral frameworks, and explainable, consistent approaches to machine ethics.
