CAMERO: 具有体重共享的受扰动语言模式统一正规组合 (CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing)

Model ensemble is a popular approach to produce a low-variance and well-generalized model. However, it induces large memory and inference costs, which are often not affordable for real-world deployment. Existing work has resorted to sharing weights among models. However, when increasing the proportion of the shared weights, the resulting models tend to be similar, and the benefits of using model ensemble diminish. To retain ensemble benefits while maintaining a low memory cost, we propose a consistency-regularized ensemble learning approach based on perturbed models, named CAMERO. Specifically, we share the weights of bottom layers across all models and apply different perturbations to the hidden representations for different models, which can effectively promote the model diversity. Meanwhile, we apply a prediction consistency regularizer across the perturbed models to control the variance due to the model diversity. Our experiments using large language models demonstrate that CAMERO significantly improves the generalization performance of the ensemble model. Specifically, CAMERO outperforms the standard ensemble of 8 BERT-base models on the GLUE benchmark by 0.7 with a significantly smaller model size (114.2M vs. 880.6M).

翻译：模型组合是一种流行的方法,可以产生一种低差异和广度的模型,然而,它却引起大量的记忆和推断成本,而这种成本往往无法为现实世界的部署所负担得起。现有的工作已经采用在模型之间分享权重的办法。但是,当提高共享权重的比例时,所产生的模型往往相似,使用模型组合减少的好处也相似。为了保留共同效益,同时保持低记忆成本,我们提议一种一致性和常规化的混合学习方法。具体地说,我们在所有模型中分享底层的重量,对不同模型的隐藏表示采用不同的扰动,这可以有效地促进模型的多样性。与此同时,我们采用一个预测一致性的固定化器,在受扰动的模型中控制因模型多样性而产生的差异。我们使用大语言模型的实验表明,CAMRO在保持低记忆成本的同时,大大改进了混合模型的通用性表现。具体地说,CAMERO在所有模型中比8个BER-80基准模型的标准型号要高得多。(GLUE,以0.7为基准,比GER-80M,比GLU。

相关内容

MoDELS

关注 43

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/