We demonstrate a proof-of-concept of a large language model conducting corporate lobbying-related activities. An autoregressive large language model (OpenAI's text-davinci-003) determines whether proposed U.S. Congressional bills are relevant to specific public companies and provides explanations and confidence levels. For the bills the model deems relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. We also benchmark the performance of the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was recently released. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to exhibit improved core natural language understanding capabilities, performance on corporate lobbying-related tasks will continue to improve. If AI begins to influence law in a manner that is not a direct extension of human intentions, this threatens the critical role that law as information could play in aligning AI with humans. This paper explores how this is increasingly a possibility. Initially, AI is being used simply to augment human lobbyists. However, there may be a slow creep of less and less human oversight over automated assessments of policy ideas and the written communication to regulatory agencies and Congressional staffers. The core question raised is where to draw the line between human-driven and AI-driven policy influence.
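The relevance-classification step described above can be sketched in a minimal form. This is a hypothetical illustration, not the authors' exact setup: the prompt wording, field names, and answer-parsing logic are all assumptions about how such a pipeline might be assembled before the prompt is sent to a completion model such as text-davinci-003.

```python
# Hypothetical sketch of the bill-relevance step: build a prompt asking
# whether a bill is relevant to a company, then map the model's free-text
# answer to a binary label. Prompt wording and parsing are assumptions.

def build_relevance_prompt(bill_title: str, bill_summary: str,
                           company: str, business: str) -> str:
    """Assemble a prompt asking the model whether a bill is relevant
    to a given company, with an explanation and a confidence level."""
    return (
        f"Company: {company}\n"
        f"Business description: {business}\n"
        f"Bill title: {bill_title}\n"
        f"Bill summary: {bill_summary}\n"
        "Is this bill relevant to the company? Answer YES or NO, then give a\n"
        "confidence level (0-100) and a one-paragraph explanation.\n"
        "Answer:"
    )

def parse_relevance(completion: str) -> bool:
    """Reduce the model's free-text answer to a binary relevance label,
    matching the irrelevance-vs-relevance benchmark framing."""
    return completion.strip().upper().startswith("YES")
```

In this framing, the always-irrelevant baseline mentioned above corresponds to `parse_relevance` returning `False` for every bill, so any model must beat that constant predictor to add value.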