We demonstrate a proof-of-concept of a large language model conducting corporate lobbying-related activities. An autoregressive large language model (OpenAI's text-davinci-003) determines whether proposed U.S. Congressional bills are relevant to specific public companies and provides explanations and confidence levels. For the bills it deems relevant, the model drafts a letter to the bill's sponsor in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of novel ground-truth labels of the relevance of a bill to a company to benchmark the model's performance; it outperforms the baseline of always predicting the most common outcome, irrelevance. We also benchmark the previous OpenAI GPT-3 model (text-davinci-002), which was the state of the art on many academic natural language tasks until text-davinci-003 was recently released. The text-davinci-002 model performs worse than the simple baseline. These results suggest that, as large language models continue to exhibit improved natural language understanding capabilities, performance on lobbying-related tasks will continue to improve. Longer term, if AI begins to influence law in a manner that is not a direct extension of human intentions, this threatens the critical role that law as information could play in aligning AI with humans. Initially, AI is being used simply to augment human lobbyists for a small portion of their daily tasks. However, firms have an incentive to use less and less human oversight over automated assessments of policy ideas and over written communication to regulatory agencies and Congressional staffers. The core question raised is where to draw the line between human-driven and AI-driven policy influence.
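The comparison against the "most common outcome" baseline can be illustrated with a minimal sketch. The function names and the toy labels below are hypothetical and were invented for illustration; they are not the paper's data, code, or results. The sketch only shows how accuracy on binary relevance labels might be compared with the accuracy of always predicting the majority class.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one label is required")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    if not labels:
        raise ValueError("at least one label is required")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


# Toy data, invented for illustration only (True = bill is relevant to the company).
labels = [False, False, False, True, False, True, False, False]
model_predictions = [False, False, False, True, False, True, True, False]

baseline = majority_baseline_accuracy(labels)
model = accuracy(model_predictions, labels)
print(f"baseline accuracy: {baseline:.3f}, model accuracy: {model:.3f}")
```

When most bill-company pairs are irrelevant, the majority baseline is already high, so a model has to beat it, rather than merely exceed 50% accuracy, to show that it is doing useful work.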