以连成内部分类器提前退出 (Early Exiting with Ensemble Internal Classifiers) - 专知论文

会员服务 ·

0

模型评估 · SimPLe · 目标函数 · 集成 · 推断 ·

2021 年 5 月 28 日

Early Exiting with Ensemble Internal Classifiers

翻译：以连成内部分类器提前退出

Tianxiang Sun,Yunhua Zhou,Xiangyang Liu,Xinyu Zhang,Hao Jiang,Zhao Cao,Xuanjing Huang,Xipeng Qiu

As a simple technique to accelerate inference of large-scale pre-trained models, early exiting has gained much attention in the NLP community. It allows samples to exit early at internal classifiers without passing through the entire model. Most existing work usually trains the internal classifiers independently and employs an exiting strategy to decide whether or not to exit based on the confidence of the current internal classifier. However, none of these works takes full advantage of the fact that the internal classifiers are trained to solve the same task therefore can be used to construct an ensemble. In this paper, we show that a novel objective function for the training of the ensemble internal classifiers can be naturally induced from the perspective of ensemble learning and information theory. The proposed training objective consists of two terms: one for accuracy and the other for the diversity of the internal classifiers. In contrast, the objective used in prior work is exactly the accuracy term of our training objective therefore only optimizes the accuracy but not diversity. Further, we propose a simple voting-based strategy that considers predictions of all the past internal classifiers to infer the correct label and decide whether to exit. Experimental results on various NLP tasks show that our proposed objective function and voting-based strategy can achieve better accuracy-speed trade-offs.

翻译：作为加快大规模预先培训模式推论的简单技术,早期退出在NLP社区中引起了很大的注意。它使样本可以在不通过整个模型的情况下提前从内部分类器中退出,现有工作大多是独立培训内部分类器,并使用一种退出战略,以根据当前内部分类器的信心决定是否退出。然而,这些工程都没有充分利用内部分类器受过培训以完成同样任务,因此,可以用来构建一个组合。在本文中,我们表明,从共同学习和信息理论的角度自然可以引出培训内部分类器的新目标功能。拟议培训目标包括两个术语:一个是准确性,另一个是内部分类器的多样性。相比之下,以前工作的目标恰恰是我们培训目标的准确性术语,因此只能优化准确性,而不是多样性。此外,我们提出一个简单的基于投票的战略,即考虑对过去所有内部分类器进行预测,从而推导出正确的标签,并决定是否实现更准确性退出目标。实验性任务显示各种NL 实验性任务能够显示我们所提议的目标值。

1

相关内容

模型评估

机器学习系统设计系统评估标准

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

专知会员服务

111+阅读 · 2020年6月10日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

最大均方差正则化贝叶斯神经网络，Bayesian Neural Networks With Maximum Mean Discrepancy Regularization

最大均方差正则化贝叶斯神经网络，Bayesian Neural Networks With Maximum Mean Discrepancy Regularization

专知会员服务

54+阅读 · 2020年3月5日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【AAAI 2020】将深度学习与逻辑融合用于信息提取（Integrating Deep Learning with Logic Fusion for Information Extraction）

【AAAI 2020】将深度学习与逻辑融合用于信息提取（Integrating Deep Learning with Logic Fusion for Information Extraction）

专知会员服务

66+阅读 · 2019年12月28日

【Freddy Lecue 博士|公开演讲】关于知识图谱在可解释AI中的作用（On The Role of Knowledge Graphs in Explainable AI.），International Semantic Web Conference 2019

【Freddy Lecue 博士|公开演讲】关于知识图谱在可解释AI中的作用（On The Role of Knowledge Graphs in Explainable AI.），International Semantic Web Conference 2019

专知会员服务

73+阅读 · 2019年11月11日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Encoders and Ensembles for Task-Free Continual Learning

Encoders and Ensembles for Task-Free Continual Learning

Arxiv

0+阅读 · 2021年7月19日

Self-Supervised Graph Learning with Proximity-based Views and Channel Contrast

Arxiv

0+阅读 · 2021年7月19日

Multi-Level Contrastive Learning for Few-Shot Problems

Arxiv

0+阅读 · 2021年7月15日

Data-Driven Diverse Ensembles of Logistic Regression Models

Arxiv

0+阅读 · 2021年7月15日

Fine-grained Angular Contrastive Learning with Coarse Labels

Arxiv

9+阅读 · 2020年12月7日

Hyperparameter Ensembles for Robustness and Uncertainty Quantification

Arxiv

12+阅读 · 2020年6月24日

Learning to Propagate for Graph Meta-Learning

Arxiv

14+阅读 · 2019年9月11日

Multi-Label Learning with Label Enhancement

Multi-Label Learning with Label Enhancement

Arxiv

4+阅读 · 2019年4月16日

Integrating Semantic Knowledge to Tackle Zero-shot Text Classification

Arxiv

6+阅读 · 2019年3月29日

Multi-class Classification without Multi-class Labels

Multi-class Classification without Multi-class Labels

Arxiv

4+阅读 · 2019年1月2日

VIP会员

文章信息

相关主题

相关VIP内容

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

专知会员服务

111+阅读 · 2020年6月10日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

最大均方差正则化贝叶斯神经网络，Bayesian Neural Networks With Maximum Mean Discrepancy Regularization

最大均方差正则化贝叶斯神经网络，Bayesian Neural Networks With Maximum Mean Discrepancy Regularization

专知会员服务

54+阅读 · 2020年3月5日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【AAAI 2020】将深度学习与逻辑融合用于信息提取（Integrating Deep Learning with Logic Fusion for Information Extraction）

【AAAI 2020】将深度学习与逻辑融合用于信息提取（Integrating Deep Learning with Logic Fusion for Information Extraction）

专知会员服务

66+阅读 · 2019年12月28日

【Freddy Lecue 博士|公开演讲】关于知识图谱在可解释AI中的作用（On The Role of Knowledge Graphs in Explainable AI.），International Semantic Web Conference 2019

【Freddy Lecue 博士|公开演讲】关于知识图谱在可解释AI中的作用（On The Role of Knowledge Graphs in Explainable AI.），International Semantic Web Conference 2019

专知会员服务

73+阅读 · 2019年11月11日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

《毁灭算法：解析以色列在加沙的AI军事行动》

【COLT 2025最新教程】语言生成

以机器速度锁定目标：人工智能的能力与局限

【ICML2025】通过在线世界模型规划的持续强化学习

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Encoders and Ensembles for Task-Free Continual Learning

Encoders and Ensembles for Task-Free Continual Learning

Arxiv

0+阅读 · 2021年7月19日

Self-Supervised Graph Learning with Proximity-based Views and Channel Contrast

Arxiv

0+阅读 · 2021年7月19日

Multi-Level Contrastive Learning for Few-Shot Problems

Arxiv

0+阅读 · 2021年7月15日

Data-Driven Diverse Ensembles of Logistic Regression Models

Arxiv

0+阅读 · 2021年7月15日

Fine-grained Angular Contrastive Learning with Coarse Labels

Arxiv

9+阅读 · 2020年12月7日

Hyperparameter Ensembles for Robustness and Uncertainty Quantification

Arxiv

12+阅读 · 2020年6月24日

Learning to Propagate for Graph Meta-Learning

Arxiv

14+阅读 · 2019年9月11日

Multi-Label Learning with Label Enhancement

Multi-Label Learning with Label Enhancement

Arxiv

4+阅读 · 2019年4月16日

Integrating Semantic Knowledge to Tackle Zero-shot Text Classification

Arxiv

6+阅读 · 2019年3月29日

Multi-class Classification without Multi-class Labels

Multi-class Classification without Multi-class Labels

Arxiv

4+阅读 · 2019年1月2日

微信扫码咨询专知VIP会员