On many natural language processing tasks, large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods. Nevertheless, their huge model size and low inference speed have hindered their deployment on resource-limited devices in practice. In this paper, we aim to compress PLMs with knowledge distillation, and propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information. Specifically, to enhance the model capability and transferability, we leverage the idea of meta-learning and set up domain-relational graphs to capture the relational information across different domains. To dynamically select the most representative prototypes for each domain, we further propose a hierarchical compare-aggregate mechanism to capture hierarchical relationships. Extensive experiments on public multi-domain datasets demonstrate the superior performance of our HRKD method as well as its strong few-shot learning ability. For reproducibility, we release the code at https://github.com/cheneydon/hrkd.
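To make the two components of the abstract concrete, the following is a minimal, hypothetical PyTorch sketch (not the authors' implementation; see the linked repository for that). It assumes per-domain prototype vectors are already available, and the class name `DomainRelationalWeights`, the dot-product affinity used as the domain-relational graph, and the simple mean aggregation standing in for the hierarchical compare-aggregate mechanism are all illustrative assumptions.

```python
# Minimal sketch: build soft domain weights from a domain-relational graph and
# use them to reweight a per-domain knowledge-distillation loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DomainRelationalWeights(nn.Module):
    """Builds a domain-relational graph from per-domain prototype vectors
    and turns it into soft weights for each domain's distillation loss.
    (Hypothetical stand-in for the paper's mechanism.)"""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, prototypes: torch.Tensor) -> torch.Tensor:
        # prototypes: [num_domains, hidden_dim], e.g. pooled teacher features
        # per domain (a simplified substitute for prototype selection via the
        # hierarchical compare-aggregate mechanism).
        q = self.proj(prototypes)              # [D, H]
        affinity = q @ prototypes.t()          # [D, D] graph edge strengths
        scores = affinity.mean(dim=-1)         # aggregate relations per domain
        return F.softmax(scores, dim=-1)       # soft weights over domains


def weighted_distillation_loss(student_logits, teacher_logits,
                               domain_weights, domain_ids,
                               temperature: float = 2.0):
    """KL-based distillation loss where each example's contribution is
    scaled by the relational weight of its domain."""
    t = temperature
    per_example = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="none",
    ).sum(dim=-1) * (t * t)                    # [batch]
    return (domain_weights[domain_ids] * per_example).mean()
```

In this sketch, domains whose prototypes relate strongly to the others receive larger weights, so the student spends more of its distillation budget on them; the actual HRKD method couples this with meta-learning and a hierarchical selection of prototypes, which are omitted here.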