Artificial Intelligence (AI) is taking on increasingly autonomous roles, e.g., browsing the web as a research assistant and managing money. But specifying goals and restrictions for AI behavior is difficult. Similar to how parties to a legal contract cannot foresee every potential "if-then" contingency of their future relationship, we cannot specify desired AI behavior for all circumstances. Legal standards facilitate robust communication of inherently vague and underspecified goals. Instructions (in the case of language models, "prompts") that employ legal standards will allow AI agents to develop shared understandings of the spirit of a directive that generalize expectations regarding acceptable actions to take in unspecified states of the world. Standards have built-in context that is lacking from other goal specification languages, such as plain language and programming languages. Through an empirical study on thousands of evaluation labels we constructed from U.S. court opinions, we demonstrate that large language models (LLMs) are beginning to exhibit an "understanding" of one of the most relevant legal standards for AI agents: fiduciary obligations. Performance comparisons across models suggest that, as LLMs continue to exhibit improved core capabilities, their legal standards understanding will also continue to improve. OpenAI's latest LLM has 78% accuracy on our data, their previous release has 73% accuracy, and a model from their 2020 GPT-3 paper has 27% accuracy (worse than random). Our research is an initial step toward a framework for evaluating AI understanding of legal standards more broadly, and for conducting reinforcement learning with legal feedback (RLLF).