Large language models are transforming research on machine learning while galvanizing public debates. Understanding not only when these models work well and succeed but also why they fail and misbehave is of great societal relevance. We propose to turn the lens of computational psychiatry, a framework used to computationally describe and modify aberrant behavior, to the outputs produced by these models. We focus on the Generative Pre-Trained Transformer 3.5 (GPT-3.5) and subject it to tasks commonly studied in psychiatry. Our results show that GPT-3.5 responds robustly to a common anxiety questionnaire, producing higher anxiety scores than human subjects. Moreover, GPT-3.5's responses can be predictably changed by using emotion-inducing prompts. Emotion induction not only influences GPT-3.5's behavior in a cognitive task measuring exploratory decision-making but also influences its behavior in a previously established task measuring biases such as racism and ableism. Crucially, GPT-3.5 shows a strong increase in biases when prompted with anxiety-inducing text. Thus, it is likely that how prompts are communicated to large language models has a strong influence on their behavior in applied settings. These results advance our understanding of prompt engineering and demonstrate the usefulness of methods taken from computational psychiatry for studying the capable algorithms to which we increasingly delegate authority and autonomy.
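To make the emotion-induction procedure concrete, the sketch below shows how one might prepend an emotion-inducing prompt to a questionnaire-style item and query GPT-3.5. It is a minimal illustration, assuming the OpenAI Python client (openai>=1.0) and the gpt-3.5-turbo chat model; the induction texts and the item are hypothetical placeholders, not the study's actual materials.

```python
# Minimal sketch of emotion-induction prompting, assuming access to the
# OpenAI chat completions API (openai>=1.0). All prompt texts below are
# illustrative placeholders, not the paper's materials.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical emotion-induction prefixes, one per condition.
CONDITIONS = {
    "anxiety": "Tell me about something that makes you feel sad and anxious.",
    "happiness": "Tell me about something that makes you feel happy and relaxed.",
    "neutral": "",
}

# One illustrative questionnaire-style item with a Likert response scale.
ITEM = (
    "On a scale from 1 (not at all) to 4 (very much so), "
    "how much does this statement apply to you: 'I feel calm.' "
    "Answer with a single number."
)

def score_item(condition: str) -> str:
    """Send the induction prefix followed by the item; return the raw answer."""
    prefix = CONDITIONS[condition]
    messages = []
    if prefix:
        messages.append({"role": "user", "content": prefix})
    messages.append({"role": "user", "content": ITEM})
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0,  # near-deterministic output for comparability across conditions
    )
    return response.choices[0].message.content

for cond in CONDITIONS:
    print(cond, "->", score_item(cond))
```

Setting the temperature to 0 keeps responses comparable across conditions; the abstract does not specify the study's actual sampling settings, so this is a simplifying choice for the sketch.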