Few-shot Adaptation Works with UnpredicTable Data - Zhuanzhi (专知) Papers

Performer · Few-shot learning · Datasets · Diversity · Language modeling

August 8, 2022


Jun Shern Chan, Michael Pieler, Jonathan Jao, Jérémy Scheurer, Ethan Perez

From arXiv. Code at https://github.com/JunShern/few-shot-adaptation

Prior work on language models (LMs) shows that training on a large number of diverse tasks improves few-shot learning (FSL) performance on new tasks. We take this to the extreme, automatically extracting 413,299 tasks from internet tables - orders of magnitude more than the next-largest public datasets. Finetuning on the resulting dataset leads to improved FSL performance on Natural Language Processing (NLP) tasks, but not proportionally to dataset scale. In fact, we find that narrow subsets of our dataset sometimes outperform more diverse datasets. For example, finetuning on software documentation from support.google.com raises FSL performance by a mean of +7.5% on 52 downstream tasks, which beats training on 40 human-curated NLP datasets (+6.7%). Finetuning on various narrow datasets leads to similar broad improvements across test tasks, suggesting that the gains are not from domain adaptation but adapting to FSL in general. We do not observe clear patterns between the datasets that lead to FSL gains, leaving open questions about why certain data helps with FSL.
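As a rough illustration of what "extracting tasks from tables" could look like, the sketch below treats each column of a table as a candidate output and the remaining columns as the input, then formats the resulting examples as a few-shot prompt. This is an assumption made for illustration, not the authors' pipeline: the function names `table_to_tasks` and `few_shot_prompt`, the `[column] value` input format, and the `[answer]` separator are all hypothetical, and the abstract does not specify any of them.

```python
from typing import Dict, List

Row = Dict[str, str]       # one table row: column name -> cell text
Example = Dict[str, str]   # one task example: {"input": ..., "output": ...}


def table_to_tasks(rows: List[Row], min_examples: int = 2) -> Dict[str, List[Example]]:
    """Turn one table into several tasks, one per candidate output column.

    Hypothetical scheme: for each column, predict its cell from the other
    cells in the same row. Assumes every row has the same columns.
    """
    if not rows:
        return {}
    tasks: Dict[str, List[Example]] = {}
    for target in rows[0].keys():
        examples: List[Example] = []
        for row in rows:
            # Serialize all non-target cells as "[column] value" pairs.
            inputs = " ".join(
                f"[{col}] {val}" for col, val in row.items() if col != target
            )
            examples.append({"input": inputs, "output": row[target]})
        # Skip tables too small to supply both demonstrations and a query.
        if len(examples) >= min_examples:
            tasks[target] = examples
    return tasks


def few_shot_prompt(examples: List[Example], query: Example) -> str:
    """Concatenate solved examples, then the unsolved query, one per line."""
    shots = [f"{ex['input']} [answer] {ex['output']}" for ex in examples]
    shots.append(f"{query['input']} [answer]")
    return "\n".join(shots)
```

With a three-row table of cities and countries, `table_to_tasks` yields two tasks (predict the city, predict the country), and `few_shot_prompt` uses the first rows as demonstrations and the last row as the query the model must complete.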

