Building artificial intelligence (AI) that aligns with human values is an unsolved problem. Here, we developed a human-in-the-loop research pipeline called Democratic AI, in which reinforcement learning is used to design a social mechanism that a majority of humans prefer. A large group of humans played an online investment game in which each player decided whether to keep a monetary endowment or to share it with others for collective benefit. Shared revenue was returned to players under two different redistribution mechanisms, one designed by the AI and the other by humans. The AI discovered a mechanism that redressed initial wealth imbalance, sanctioned free riders, and won the majority vote. By optimizing for human preferences, Democratic AI may be a promising method for value-aligned policy innovation.
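To make the structure of the investment game concrete, the following is a minimal sketch of one round of a public-goods-style game with a pluggable redistribution mechanism. Everything specific in it is an assumption made for illustration: the growth factor `GROWTH_FACTOR`, the example endowments, and the two mechanisms `equal_split` and `relative_contribution` are hypothetical and are not the mechanisms evaluated in the study, nor the one the AI discovered. The second mechanism only illustrates how a payout rule could, in principle, favor players who share a large fraction of a small endowment and return nothing to free riders.

```python
from typing import Callable, List

# Hypothetical multiplier applied to the shared pool (an assumption for
# illustration, not a value taken from the study).
GROWTH_FACTOR = 1.6

# A mechanism maps (endowments, contributions, pool) to per-player payouts.
Mechanism = Callable[[List[float], List[float], float], List[float]]


def equal_split(endowments: List[float], contributions: List[float],
                pool: float) -> List[float]:
    """Return the pool in equal shares, regardless of contribution."""
    n = len(contributions)
    return [pool / n] * n


def relative_contribution(endowments: List[float], contributions: List[float],
                          pool: float) -> List[float]:
    """Return the pool in proportion to the fraction of the endowment shared."""
    fractions = [c / e if e > 0 else 0.0
                 for c, e in zip(contributions, endowments)]
    total = sum(fractions)
    if total == 0:
        return [0.0] * len(contributions)
    return [pool * f / total for f in fractions]


def play_round(endowments: List[float], contributions: List[float],
               mechanism: Mechanism) -> List[float]:
    """Compute each player's final payoff: kept money plus redistributed share."""
    for e, c in zip(endowments, contributions):
        if not 0 <= c <= e:
            raise ValueError("each contribution must lie between 0 and the endowment")
    pool = GROWTH_FACTOR * sum(contributions)
    payouts = mechanism(endowments, contributions, pool)
    return [e - c + p for e, c, p in zip(endowments, contributions, payouts)]


if __name__ == "__main__":
    # One wealthy player, two poorer players who share everything,
    # and one free rider who shares nothing (illustrative numbers).
    endowments = [10.0, 2.0, 2.0, 2.0]
    contributions = [5.0, 2.0, 2.0, 0.0]
    print(play_round(endowments, contributions, equal_split))
    print(play_round(endowments, contributions, relative_contribution))
```

Under the equal split in this toy setup, the free rider ends the round better off than the two players who shared everything; under the relative-contribution rule, the free rider receives nothing from the pool. A majority vote between two such mechanisms, as described in the abstract, would then be a comparison of how many players prefer each outcome.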