When a neural model is upgraded to a newer version, errors that the legacy version did not make can be introduced, known as regression errors. This inconsistent behavior during a model upgrade often outweighs the benefit of the accuracy gain and hinders the adoption of new models. To mitigate regression errors from model upgrades, distillation and ensembling have proven to be viable solutions without a significant compromise in performance. Despite this progress, these approaches attain only an incremental reduction in regression, which is still far from achieving backward-compatible model upgrades. In this work, we propose a novel method, Gated Fusion, that promotes backward compatibility by learning to mix the predictions of the old and new models. Empirical results on two distinct model-upgrade scenarios show that our method reduces the number of regression errors by 62% on average, outperforming the strongest baseline by an average of 25%.
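To make the mixing idea concrete, the sketch below shows one plausible way a learned gate could combine the predictions of a frozen old model and a new model. This is a minimal illustration under our own assumptions (the `GatedFusion` class, the single-layer sigmoid gate, the assumption that both models return `(logits, hidden)`, and mixing class probabilities are all illustrative choices), not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    """Sketch of gated prediction mixing between an old and a new model.

    Assumptions (not from the paper): both models take the same input and
    return a (logits, hidden) pair; the gate is a single linear layer with
    a sigmoid, conditioned on the new model's hidden representation.
    """

    def __init__(self, old_model: nn.Module, new_model: nn.Module, hidden_dim: int):
        super().__init__()
        self.old_model = old_model.eval()  # legacy model stays frozen
        for p in self.old_model.parameters():
            p.requires_grad_(False)
        self.new_model = new_model
        # Learned gate: maps the new model's features to a scalar in (0, 1).
        self.gate = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            old_logits, _ = self.old_model(inputs)
        new_logits, new_hidden = self.new_model(inputs)
        g = self.gate(new_hidden)  # shape: (batch, 1)
        # Convex combination of class probabilities: the gate can learn to
        # defer to the old model on inputs where the new model would
        # otherwise introduce a regression error.
        old_probs = old_logits.softmax(dim=-1)
        new_probs = new_logits.softmax(dim=-1)
        return g * old_probs + (1.0 - g) * new_probs
```

In this sketch, training the gate (and optionally the new model) with a standard classification loss lets the fused model fall back on the old model's behavior where it was already correct, which is one way to trade off accuracy gains against backward compatibility.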