Artificial Intelligence (AI) is taking on increasingly autonomous roles, e.g., browsing the web as a research assistant and managing money. But specifying goals and restrictions for AI behavior is difficult. Just as parties to a legal contract cannot foresee every potential "if-then" contingency of their future relationship, we cannot specify desired AI behavior for all circumstances. Legal standards facilitate the robust communication of inherently vague and underspecified goals. Instructions (in the case of language models, "prompts") that employ legal standards will allow AI agents to develop shared understandings of the spirit of a directive that can adapt to novel situations, and to generalize expectations regarding acceptable actions in unspecified states of the world. Standards have built-in context that is lacking from other goal specification languages, such as plain language and programming languages. Through an empirical study on thousands of evaluation labels we constructed from U.S. court opinions, we demonstrate that large language models (LLMs) are beginning to exhibit an "understanding" of one of the legal standards most relevant for AI agents: fiduciary obligations. Performance comparisons across models suggest that, as LLMs continue to exhibit improved core capabilities, their understanding of legal standards will also continue to improve. OpenAI's latest LLM achieves 78% accuracy on our data, its previous release 73%, and a model from their 2020 GPT-3 paper 27% (worse than random). Our research is an initial step toward a framework for evaluating AI understanding of legal standards more broadly, and for conducting reinforcement learning with legal feedback (RLLF).
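The model comparison above reduces to scoring each model's answers against gold labels derived from court opinions. The following is a minimal, hypothetical sketch of that scoring step, assuming binary breach/no-breach labels (the label names and example data here are illustrative assumptions, not the actual dataset):

```python
# Hypothetical sketch: scoring LLM outputs against evaluation labels
# constructed from court opinions. Labels and data are illustrative
# assumptions, not the paper's actual benchmark.

def accuracy(predictions, gold_labels):
    """Fraction of predictions that exactly match the gold labels."""
    assert len(predictions) == len(gold_labels)
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)

# Each example asks: did the described conduct breach a fiduciary duty?
gold    = ["breach", "no_breach", "breach", "breach", "no_breach"]
model_a = ["breach", "no_breach", "breach", "no_breach", "no_breach"]  # e.g., a newer model
model_b = ["no_breach", "no_breach", "breach", "no_breach", "breach"]  # e.g., an older model

print(accuracy(model_a, gold))  # 0.8
print(accuracy(model_b, gold))  # 0.4 (below the 0.5 random baseline for binary labels)
```

With binary labels, random guessing yields roughly 50% accuracy, which is why the 27% figure reported for the 2020-era model indicates systematically wrong answers rather than mere noise.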