Let's Keep It Safe: Designing User Interfaces that Allow Everyone to Contribute to AI Safety

From arXiv. The full journal version of this article (published in Proceedings of the ACM on Human-Computer Interaction 4, CSCW2) can be found at https://dl.acm.org/doi/10.1145/3415168. The article is public access.

When AI systems are granted the agency to take impactful actions in the real world, there is an inherent risk that these systems behave in ways that are harmful. Typically, humans specify constraints on the AI system to prevent harmful behavior; however, very little work has studied how best to facilitate this difficult constraint specification process. In this paper, we study how to design user interfaces that make this process more effective and accessible, allowing people with a diversity of backgrounds and levels of expertise to contribute to this task. We first present a task design in which workers evaluate the safety of individual state-action pairs, and propose several variants of this task with improved task design and filtering mechanisms. Although this first design is easy to understand, it scales poorly to large state spaces. Therefore, we develop a new user interface that allows workers to write constraint rules without any programming. Despite its simplicity, we show that our rule construction interface retains full expressiveness. We present experiments utilizing crowdworkers to help address an important real-world AI safety problem in the domain of education. Our results indicate that our novel worker filtering and explanation methods outperform baseline approaches, and our rule-based interface allows workers to be much more efficient while improving data quality.

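To make the contrast between labeling individual state-action pairs and writing constraint rules concrete, here is a minimal Python sketch. It is not the paper's interface or rule format: the state features (`grade`, `difficulty`), the action names, and the single example rule are all hypothetical, chosen only to echo the education setting mentioned in the abstract. The point it illustrates is that a single rule can decide many state-action pairs that would otherwise each need a separate safety judgment.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple

# Hypothetical state: (student grade level, lesson difficulty).
State = Tuple[int, int]
Action = str


@dataclass(frozen=True)
class Rule:
    """A hypothetical constraint rule: any pair it matches is treated as unsafe."""
    description: str
    matches: Callable[[State, Action], bool]


def is_safe(state: State, action: Action, rules: List[Rule]) -> bool:
    """A pair is safe when no rule matches it."""
    return not any(rule.matches(state, action) for rule in rules)


# Enumerate a small, made-up state-action space.
grades = range(1, 7)          # grade levels 1..6
difficulties = range(1, 11)   # difficulty levels 1..10
actions = ["show_hint", "assign_quiz", "skip_lesson"]
all_pairs = [((g, d), a) for g, d, a in product(grades, difficulties, actions)]

# One illustrative rule, written once instead of labeling each pair.
rules = [
    Rule(
        description="Do not assign a quiz more than 2 levels above the grade",
        matches=lambda s, a: a == "assign_quiz" and s[1] > s[0] + 2,
    )
]

unsafe_pairs = [(s, a) for s, a in all_pairs if not is_safe(s, a, rules)]
print(f"{len(unsafe_pairs)} of {len(all_pairs)} pairs ruled unsafe by {len(rules)} rule")
```

In this toy space, pair-by-pair evaluation would require a judgment for every entry in `all_pairs`, and that number grows multiplicatively with each added state feature, whereas the rule count need not grow with it. This is one way to read the abstract's remark that the first design scales poorly to large state spaces.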