Previous studies have shown that cross-lingual knowledge distillation can significantly improve the performance of pre-trained models on cross-lingual similarity matching tasks. However, the student model must be large for this approach to work well; otherwise, its performance drops sharply, making it impractical to deploy on memory-limited devices. To address this issue, we delve into cross-lingual knowledge distillation and propose a multi-stage distillation framework for constructing a small yet high-performing cross-lingual model. In our framework, contrastive learning, bottleneck, and parameter recurrent strategies are combined to prevent performance from degrading during compression. The experimental results demonstrate that our method compresses the sizes of XLM-R and MiniLM by more than 50\%, while reducing performance by only about 1\%.
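To make the three strategies named above more concrete, the following is a minimal sketch (not the authors' released code) of how they might be combined during distillation: an InfoNCE-style contrastive loss aligning student and teacher sentence embeddings, a low-dimensional bottleneck projection on the student output, and cross-layer parameter sharing ("parameter recurrence") in the student encoder. All module names, dimensions, and hyperparameters here are illustrative assumptions, not values from the paper.

```python
# Illustrative sketch only: dimensions, pooling, and loss details are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RecurrentStudentEncoder(nn.Module):
    """Student encoder that reuses one Transformer layer several times (parameter sharing)."""

    def __init__(self, hidden=384, n_heads=6, n_steps=6, bottleneck=128):
        super().__init__()
        # A single set of layer weights, applied n_steps times ("parameter recurrent" strategy).
        self.shared_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=n_heads, batch_first=True)
        self.n_steps = n_steps
        # Bottleneck projection that compresses the pooled sentence representation.
        self.bottleneck = nn.Linear(hidden, bottleneck)

    def forward(self, token_embeddings):
        h = token_embeddings
        for _ in range(self.n_steps):
            h = self.shared_layer(h)       # recurrent application of the shared layer
        sent = h.mean(dim=1)               # simple mean pooling over tokens
        return self.bottleneck(sent)       # bottlenecked sentence embedding


def contrastive_distillation_loss(student_emb, teacher_emb, temperature=0.05):
    """InfoNCE-style loss: each student embedding should match its own teacher
    embedding against the other teacher embeddings in the batch (in-batch negatives)."""
    s = F.normalize(student_emb, dim=-1)
    t = F.normalize(teacher_emb, dim=-1)
    logits = s @ t.t() / temperature                     # (batch, batch) similarity matrix
    targets = torch.arange(s.size(0), device=s.device)   # positives lie on the diagonal
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    batch, seq_len, hidden, emb_dim = 8, 16, 384, 128
    student = RecurrentStudentEncoder(hidden=hidden, bottleneck=emb_dim)
    token_embeddings = torch.randn(batch, seq_len, hidden)  # stand-in for embedded input tokens
    teacher_emb = torch.randn(batch, emb_dim)               # stand-in for (projected) teacher outputs
    loss = contrastive_distillation_loss(student(token_embeddings), teacher_emb)
    loss.backward()
    print(f"contrastive distillation loss: {loss.item():.4f}")
```

In this sketch, compression comes from two places: the shared Transformer layer keeps the parameter count low regardless of effective depth, and the bottleneck shrinks the embedding that downstream similarity matching consumes; the contrastive objective is what ties the compressed student back to the teacher's cross-lingual embedding space.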