Both performance and efficiency are crucial for sequence labeling in many real-world scenarios. Although pre-trained models (PTMs) have significantly improved performance on various sequence labeling tasks, their inference is computationally expensive. To alleviate this problem, we extend the recently successful early-exit mechanism to accelerate PTM inference for sequence labeling. However, existing early-exit mechanisms are designed for sequence-level tasks rather than sequence labeling. In this paper, we first propose a simple extension of sentence-level early-exit to sequence labeling tasks. To further reduce the computational cost, we also propose a token-level early-exit mechanism that allows some tokens to exit early at different layers. Considering the local dependencies inherent in sequence labeling, we employ a window-based criterion to decide whether a token should exit. Token-level early-exit introduces a gap between training and inference, so we add an extra self-sampling fine-tuning stage to alleviate it. Extensive experiments on three popular sequence labeling tasks show that our approach can reduce inference cost by up to 66%-75% with minimal performance degradation. Compared with competitive compressed models such as DistilBERT, our approach achieves better performance at the same speed-up ratios of 2X, 3X, and 4X.
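To make the window-based token-level exit idea concrete, the following is a minimal sketch, not the authors' implementation: it assumes max softmax probability as the per-token confidence measure, and the function name, the min-pooling over the window, and the threshold value are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def window_exit_mask(logits: torch.Tensor, window: int = 1,
                     threshold: float = 0.9) -> torch.Tensor:
    """Decide which tokens may exit at the current layer.

    Hypothetical sketch: a token exits only when every token inside
    its +/- `window` neighborhood is predicted confidently, which
    reflects the local dependencies of sequence labeling.
    `logits` has shape (seq_len, num_labels).
    """
    # Per-token confidence = highest softmax probability (assumption).
    conf = F.softmax(logits, dim=-1).max(dim=-1).values  # (seq_len,)
    # Min-pool confidence over the local window: a token's exit score
    # is the lowest confidence among its neighbors, so one uncertain
    # neighbor keeps the whole window in the model.
    padded = F.pad(conf.unsqueeze(0).unsqueeze(0),
                   (window, window), value=1.0)        # (1, 1, L + 2w)
    windowed = -F.max_pool1d(-padded, kernel_size=2 * window + 1,
                             stride=1).squeeze()       # (seq_len,)
    return windowed >= threshold  # True => token halts at this layer
```

In such a scheme, tokens whose mask entry is True would take their current-layer label predictions and be excluded from deeper layers, while the remaining tokens continue upward; the paper's actual criterion and halting logic may differ.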