大语言模型可以很容易地被无关紧要的背景所忽略 (Large Language Models Can Be Easily Distracted by Irrelevant Context) - 专知论文

会员服务 ·

0

语言模型化 · MoDELS · INFORMS · Performer · Prompt ·

2023 年 2 月 13 日

Large Language Models Can Be Easily Distracted by Irrelevant Context

翻译：大语言模型可以很容易地被无关紧要的背景所忽略

Freda Shi,Xinyun Chen,Kanishka Misra,Nathan Scales,David Dohan,Ed Chi,Nathanael Schärli,Denny Zhou

Large language models have achieved impressive performance on various natural language processing tasks. However, so far they have been evaluated primarily on benchmarks where all information in the input context is relevant for solving the task. In this work, we investigate the distractibility of large language models, i.e., how the model problem-solving accuracy can be influenced by irrelevant context. In particular, we introduce Grade-School Math with Irrelevant Context (GSM-IC), an arithmetic reasoning dataset with irrelevant information in the problem description. We use this benchmark to measure the distractibility of cutting-edge prompting techniques for large language models, and find that the model performance is dramatically decreased when irrelevant information is included. We also identify several approaches for mitigating this deficiency, such as decoding with self-consistency and adding to the prompt an instruction that tells the language model to ignore the irrelevant information.

翻译：大型语言模型在各种自然语言处理任务上取得了令人印象深刻的成绩,但迄今为止,这些模型主要是在与解决任务相关的投入背景下的所有信息都相关的基准上进行评估。在这项工作中,我们调查大型语言模型的可转移性,即解决问题模型的准确性如何受到不相关背景的影响。特别是,我们引入了具有与不相关背景的高中数学(GSM-IC),这是一个算术推理数据集,其中含有问题描述中不相关的信息。我们使用这一基准来衡量大语言模型尖端快速技术的可转移性,发现在纳入不相关信息时模型性能会大大下降。我们还确定了减轻这一缺陷的若干办法,例如与自相矛盾的解码,并添加一项及时指示,指示语言模型忽略不相关信息。

0

相关内容

语言模型化

语言模型化

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

EADIA调节抑癌基因DCC凋亡通路的分子机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

RIP1/RIP3通路调控脑出血后细胞程序性坏死的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

LIMD1调控非小细胞肺癌放疗敏感性的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

三氧化二砷在TBLR1-RARα阳性急性早幼粒细胞白血病分化和凋亡中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

纳米结构核材料离子束辐照的三维蒙特卡洛模拟

国家自然科学基金

0+阅读 · 2014年12月31日

脂肪细胞来源Microparticles介导新生血管生成在2型糖尿病易损斑块形成中的作用

国家自然科学基金

0+阅读 · 2014年12月31日

IRF-1调控肺泡巨噬细胞焦亡在急性肺损伤中的作用及信号机制

国家自然科学基金

0+阅读 · 2014年12月31日

Nedd4泛素连接酶小分子化合物抑制剂的筛选和抗肿瘤活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

长链非编码RNA-uc002mbe.2介导的HDACi凋亡效应及其在肝癌中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

小分子RNA在心肌损伤中作用及机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

A Method for Evaluating Deep Generative Models of Images via Assessing the Reproduction of High-order Spatial Context

Arxiv

0+阅读 · 2023年3月31日

A Survey of Large Language Models

A Survey of Large Language Models

Arxiv

476+阅读 · 2023年3月31日

Pair Programming with Large Language Models for Sampling and Estimation of Copulas

Arxiv

0+阅读 · 2023年3月31日

The Topology-Overlap Trade-Off in Retinal Arteriole-Venule Segmentation

Arxiv

0+阅读 · 2023年3月31日

Fairness-guided Few-shot Prompting for Large Language Models

Arxiv

0+阅读 · 2023年3月31日

Making Pre-trained Language Models Better Few-shot Learners

Arxiv

14+阅读 · 2020年12月31日

Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension

Arxiv

12+阅读 · 2020年12月14日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Arxiv

17+阅读 · 2020年6月2日

Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning

Arxiv

21+阅读 · 2018年12月25日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

【论文推荐】最新六篇自动问答（QA）相关论文—复杂序列问答、注意力机制、长短时记忆、文本推理、多因素注意力、主动的问答智能体

专知

18+阅读 · 2018年2月22日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

A Method for Evaluating Deep Generative Models of Images via Assessing the Reproduction of High-order Spatial Context

Arxiv

0+阅读 · 2023年3月31日

A Survey of Large Language Models

A Survey of Large Language Models

Arxiv

476+阅读 · 2023年3月31日

Pair Programming with Large Language Models for Sampling and Estimation of Copulas

Arxiv

0+阅读 · 2023年3月31日

The Topology-Overlap Trade-Off in Retinal Arteriole-Venule Segmentation

Arxiv

0+阅读 · 2023年3月31日

Fairness-guided Few-shot Prompting for Large Language Models

Arxiv

0+阅读 · 2023年3月31日

Making Pre-trained Language Models Better Few-shot Learners

Arxiv

14+阅读 · 2020年12月31日

Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension

Arxiv

12+阅读 · 2020年12月14日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Arxiv

17+阅读 · 2020年6月2日

Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning

Arxiv

21+阅读 · 2018年12月25日

相关基金

EADIA调节抑癌基因DCC凋亡通路的分子机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

RIP1/RIP3通路调控脑出血后细胞程序性坏死的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

LIMD1调控非小细胞肺癌放疗敏感性的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

三氧化二砷在TBLR1-RARα阳性急性早幼粒细胞白血病分化和凋亡中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

纳米结构核材料离子束辐照的三维蒙特卡洛模拟

国家自然科学基金

0+阅读 · 2014年12月31日

脂肪细胞来源Microparticles介导新生血管生成在2型糖尿病易损斑块形成中的作用

国家自然科学基金

0+阅读 · 2014年12月31日

IRF-1调控肺泡巨噬细胞焦亡在急性肺损伤中的作用及信号机制

国家自然科学基金

0+阅读 · 2014年12月31日

Nedd4泛素连接酶小分子化合物抑制剂的筛选和抗肿瘤活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

长链非编码RNA-uc002mbe.2介导的HDACi凋亡效应及其在肝癌中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

小分子RNA在心肌损伤中作用及机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员