Truthful AI: Developing and governing AI that does not lie

In many contexts, lying -- the use of verbal falsehoods to deceive -- is harmful. While lying has traditionally been a human affair, AI systems that make sophisticated verbal statements are becoming increasingly prevalent. This raises the question of how we should limit the harm caused by AI "lies" (i.e. falsehoods that are actively selected for). Human truthfulness is governed by social norms and by laws (against defamation, perjury, and fraud). Differences between AI and humans present an opportunity to have more precise standards of truthfulness for AI, and to have these standards rise over time. This could provide significant benefits to public epistemics and the economy, and mitigate risks of worst-case AI futures. Establishing norms or laws of AI truthfulness will require significant work to: (1) identify clear truthfulness standards; (2) create institutions that can judge adherence to those standards; and (3) develop AI systems that are robustly truthful. Our initial proposals for these areas include: (1) a standard of avoiding "negligent falsehoods" (a generalisation of lies that is easier to assess); (2) institutions to evaluate AI systems before and after real-world deployment; and (3) explicitly training AI systems to be truthful via curated datasets and human interaction. A concerning possibility is that evaluation mechanisms for eventual truthfulness standards could be captured by political interests, leading to harmful censorship and propaganda. Avoiding this might take careful attention. And since the scale of AI speech acts might grow dramatically over the coming decades, early truthfulness standards might be particularly important because of the precedents they set.

