Large language models have shown impressive few-shot results on a wide range of tasks. However, when knowledge is key for such results, as is the case for tasks such as question answering and fact checking, models seem to require massive parameter counts to store that knowledge. Retrieval-augmented models are known to excel at knowledge-intensive tasks without needing as many parameters, but it is unclear whether they work in few-shot settings. In this work we present Atlas, a carefully designed and pre-trained retrieval-augmented language model able to learn knowledge-intensive tasks with very few training examples. We perform evaluations on a wide range of tasks, including MMLU, KILT and Natural Questions, and study the impact of the content of the document index, showing that it can easily be updated. Notably, Atlas reaches over 42% accuracy on Natural Questions using only 64 examples, outperforming a 540B-parameter model by 3% despite having 50x fewer parameters.