Social intelligence and Theory of Mind (ToM), i.e., the ability to reason about the different mental states, intents, and reactions of all people involved, allow humans to effectively navigate and understand everyday social interactions. As NLP systems are used in increasingly complex social situations, their ability to grasp social dynamics becomes crucial. In this work, we examine the open question of social intelligence and Theory of Mind in modern NLP systems from an empirical and theory-based perspective. We show that one of today's largest language models (GPT-3; Brown et al., 2020) lacks this kind of social intelligence out of the box, using two tasks: SocialIQa (Sap et al., 2019), which measures models' ability to understand intents and reactions of participants in social interactions, and ToMi (Le et al., 2019), which measures whether models can infer mental states and realities of participants in situations. Our results show that models struggle substantially at these Theory of Mind tasks, with well-below-human accuracies of 55% and 60% on SocialIQa and ToMi, respectively. To conclude, we draw on theories from pragmatics to contextualize this shortcoming of large language models, examining the limitations stemming from their data, neural architecture, and training paradigms. Challenging the prevalent narrative that scale alone is all that is needed, we posit that person-centric NLP approaches might be more effective towards neural Theory of Mind.
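As an illustration of the kind of zero-shot evaluation described above, the following is a minimal sketch (not the paper's exact protocol) of scoring a SocialIQa-style multiple-choice question with a causal language model: each candidate answer is scored by the average log-likelihood the model assigns to it given the context, and the highest-scoring option is taken as the prediction. The model name ("gpt2", a stand-in for GPT-3) and the example question are illustrative assumptions.

```python
# Hedged sketch: zero-shot multiple-choice scoring for a SocialIQa-style item.
# "gpt2" is used as an accessible stand-in; the example question is invented.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

context = ("Jordan spilled coffee on Casey's laptop right before her talk. "
           "Question: How does Casey feel afterwards? Answer:")
options = [" grateful and relaxed", " frustrated and stressed", " indifferent"]

def option_score(context: str, option: str) -> float:
    """Average log-probability of the option tokens, conditioned on the context."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    opt_ids = tokenizer(option, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, opt_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probs of each option token, taken from the position that predicts it.
    log_probs = torch.log_softmax(logits[0, ctx_ids.shape[1] - 1:-1], dim=-1)
    token_scores = log_probs.gather(1, opt_ids[0].unsqueeze(1)).squeeze(1)
    return token_scores.mean().item()

prediction = max(options, key=lambda o: option_score(context, o))
print("Model's choice:", prediction)
```

Accuracy on a benchmark would then be the fraction of items for which the highest-scoring option matches the gold answer; likelihood-based scoring is one common choice among several (e.g., prompting for the answer letter directly).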