语言模型是否具有理性？——关于一致性规范与信念修正的案例研究 (Are language models rational? The case of coherence norms and belief revision)

Do norms of rationality apply to machine learning models, in particular language models? In this paper we investigate this question by focusing on a special subset of rational norms: coherence norms. We consider both logical coherence norms as well as coherence norms tied to the strength of belief. To make sense of the latter, we introduce the Minimal Assent Connection (MAC) and propose a new account of credence, which captures the strength of belief in language models. This proposal uniformly assigns strength of belief simply on the basis of model internal next token probabilities. We argue that rational norms tied to coherence do apply to some language models, but not to others. This issue is significant since rationality is closely tied to predicting and explaining behavior, and thus it is connected to considerations about AI safety and alignment, as well as understanding model behavior more generally.

翻译：理性规范是否适用于机器学习模型，特别是语言模型？本文通过聚焦于理性规范的一个特殊子集——一致性规范来探讨这一问题。我们同时考察逻辑一致性规范以及与信念强度相关的一致性规范。为理解后者，我们引入了最小认同关联（MAC）并提出一种新的置信度解释框架，该框架能够捕捉语言模型中的信念强度。该方案基于模型内部的下一个词元概率统一分配信念强度。我们认为，与一致性相关的理性规范适用于某些语言模型，但不适用于其他模型。这一问题具有重要意义，因为理性与行为预测和解释密切相关，因此涉及人工智能安全与对齐的考量，以及对模型行为的更广泛理解。

相关内容

MoDELS

关注 44

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日