Recent improvements in the predictive quality of natural language processing systems are often dependent on a substantial increase in the number of model parameters. This has led to various attempts to compress such models, but existing methods have not considered the differences in the predictive power of various model components or in the generalizability of the compressed models. To understand the connection between model compression and out-of-distribution generalization, we define the task of compressing language representation models such that they perform best in a domain adaptation setting. We choose to address this problem from a causal perspective, attempting to estimate the average treatment effect (ATE) of a model component, such as a single layer, on the model's predictions. Our proposed ATE-guided Model Compression scheme (AMoC) generates many model candidates, differing by the model components that were removed. Then, we select the best candidate through a stepwise regression model that utilizes the ATE to predict the expected performance on the target domain. AMoC outperforms strong baselines on dozens of domain pairs across three text classification and sequence tagging tasks.
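To make the pipeline described above concrete, the following is a minimal toy sketch in Python of the three steps: estimating a component's ATE as the shift in the model's output distribution when that component is removed, generating candidates by ablating one layer at a time, and selecting a candidate with a regression from ATE to expected target-domain performance. All data and names here (`full_probs`, `average_treatment_effect`, the synthetic target scores) are illustrative assumptions, not the authors' implementation.

```python
# A minimal toy sketch of the AMoC pipeline described above, assuming a
# generic layered classifier. All quantities are synthetic stand-ins and
# every name is hypothetical, not the authors' implementation.
import numpy as np
from sklearn.linear_model import LinearRegression

def average_treatment_effect(full_probs, ablated_probs):
    """ATE of a component: the average shift in the model's output
    distribution caused by removing that component (the intervention)."""
    return float(np.mean(np.abs(full_probs - ablated_probs)))

rng = np.random.default_rng(0)
n_examples, n_classes, n_layers = 200, 3, 6

# Output distribution of the full (uncompressed) model on held-out data.
full_probs = rng.dirichlet(np.ones(n_classes), size=n_examples)

# Steps 1-2: generate candidates, one per removed layer, recording each ATE.
candidates = []
for layer in range(n_layers):
    # In practice: re-run the model with `layer` ablated; here we perturb
    # the full model's probabilities to simulate the intervention.
    noisy = full_probs + rng.normal(0.0, 0.02 * (layer + 1), full_probs.shape)
    noisy = np.clip(noisy, 1e-6, None)
    ablated_probs = noisy / noisy.sum(axis=1, keepdims=True)
    candidates.append(
        {"removed_layer": layer,
         "ate": average_treatment_effect(full_probs, ablated_probs)}
    )

# Step 3: regress expected target-domain performance on the ATE and pick
# the candidate with the best predicted score. The targets below are
# synthetic; in reality they come from domain pairs with observed labels.
X = np.array([[c["ate"]] for c in candidates])
y = 0.9 - 2.0 * X[:, 0] + rng.normal(0.0, 0.01, len(candidates))

reg = LinearRegression().fit(X, y)
pred = reg.predict(X)
best = candidates[int(np.argmax(pred))]
print(f"remove layer {best['removed_layer']}: "
      f"ATE={best['ate']:.3f}, predicted target score={pred.max():.3f}")
```

Note that the paper's selection step uses a stepwise regression fit on domain pairs where target performance is observed; the plain linear fit on synthetic targets above only stands in for that step.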