Larger language models have higher accuracy on average, but are they better on every single instance (datapoint)? Some work suggests larger models have higher out-of-distribution robustness, while other work suggests they have lower accuracy on rare subgroups. To understand these differences, we investigate these models at the level of individual instances. However, one major challenge is that individual predictions are highly sensitive to the randomness in training. We develop statistically rigorous methods to address this, and after accounting for pretraining and finetuning noise, we find that our BERT-Large is worse than BERT-Mini on at least 1-4% of instances across MNLI, SST-2, and QQP, compared to the overall accuracy improvement of 2-10%. We also find that finetuning noise increases with model size and that instance-level accuracy has momentum: improvement from BERT-Mini to BERT-Medium correlates with improvement from BERT-Medium to BERT-Large. Our findings suggest that instance-level predictions provide a rich source of information; we therefore recommend that researchers supplement model weights with model predictions.