In many contexts, lying -- the use of verbal falsehoods to deceive -- is harmful. While lying has traditionally been a human affair, AI systems that make sophisticated verbal statements are becoming increasingly prevalent. This raises the question of how we should limit the harm caused by AI "lies" (i.e. falsehoods that are actively selected for). Human truthfulness is governed by social norms and by laws (against defamation, perjury, and fraud). Differences between AI and humans present an opportunity to have more precise standards of truthfulness for AI, and to have these standards rise over time. This could provide significant benefits to public epistemics and the economy, and mitigate risks of worst-case AI futures. Establishing norms or laws of AI truthfulness will require significant work to: (1) identify clear truthfulness standards; (2) create institutions that can judge adherence to those standards; and (3) develop AI systems that are robustly truthful. Our initial proposals for these areas include: (1) a standard of avoiding "negligent falsehoods" (a generalisation of lies that is easier to assess); (2) institutions to evaluate AI systems before and after real-world deployment; and (3) explicitly training AI systems to be truthful via curated datasets and human interaction. A concerning possibility is that evaluation mechanisms for eventual truthfulness standards could be captured by political interests, leading to harmful censorship and propaganda. Avoiding this might take careful attention. And since the scale of AI speech acts might grow dramatically over the coming decades, early truthfulness standards might be particularly important because of the precedents they set.