Artificial intelligence (AI) systems will increasingly be used to cause harm as they grow more capable. In fact, AI systems are already starting to be used to automate fraudulent activities, violate human rights, create harmful fake images, and identify dangerous toxins. To prevent some misuses of AI, we argue that targeted interventions on certain capabilities will be warranted. These restrictions may include controlling who can access certain types of AI models, what they can be used for, whether outputs are filtered or can be traced back to their user, and the resources needed to develop them. We also contend that some restrictions on non-AI capabilities needed to cause harm will be required. Though capability restrictions risk reducing use more than misuse (facing an unfavorable Misuse-Use Tradeoff), we argue that interventions on capabilities are warranted when other interventions are insufficient, the potential harm from misuse is high, and there are targeted ways to intervene on capabilities. We provide a taxonomy of interventions that can reduce AI misuse, focusing on the specific steps required for a misuse to cause harm (the Misuse Chain), and a framework to determine if an intervention is warranted. We apply this reasoning to three examples: predicting novel toxins, creating harmful images, and automating spear phishing campaigns.