Ensemble methods combine the predictions of multiple models to improve performance, but they incur significantly higher computation costs at inference time. To avoid these costs, multiple neural networks can be combined into one by averaging their weights (model soups). However, this usually performs significantly worse than ensembling. Weight averaging is only beneficial when the weights are different enough to benefit from being combined, yet similar enough (in weight or feature space) to average well. Based on this idea, we propose PopulAtion Parameter Averaging (PAPA): a method that combines the generality of ensembling with the efficiency of weight averaging. PAPA leverages a population of diverse models (trained on different data orders, augmentations, and regularizations) while occasionally (not too often, not too rarely) replacing the weights of the networks with the population average of the weights. PAPA reduces the performance gap between averaging and ensembling, increasing the average accuracy of a population of models by up to 1.1% on CIFAR-10, 2.4% on CIFAR-100, and 1.9% on ImageNet compared to training independent (non-averaged) models.
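The core loop described above is simple enough to sketch. Below is a minimal PyTorch-style illustration, assuming a population of identically shaped networks, each with its own optimizer and data stream; the names `average_weights` and `papa_train` and the interval `avg_every` are illustrative choices of ours, not the paper's actual API, and the paper tunes how often averaging happens rather than fixing it.

```python
import copy
import torch

def average_weights(models):
    # Replace every network's weights with the population average
    # (the occasional averaging step described above). Non-float
    # buffers (e.g., BatchNorm step counters) are kept from model 0.
    with torch.no_grad():
        avg_state = copy.deepcopy(models[0].state_dict())
        for key, val in avg_state.items():
            if val.is_floating_point():
                avg_state[key] = torch.stack(
                    [m.state_dict()[key] for m in models]
                ).mean(dim=0)
        for m in models:
            m.load_state_dict(avg_state)

def papa_train(models, optimizers, loaders, loss_fn, num_steps, avg_every=1000):
    # Each model trains on its own data order / augmentation stream;
    # every `avg_every` steps the whole population is reset to its
    # weight average. Assumes each loader yields >= num_steps batches.
    iters = [iter(loader) for loader in loaders]
    for step in range(num_steps):
        for model, opt, it in zip(models, optimizers, iters):
            x, y = next(it)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        if (step + 1) % avg_every == 0:
            average_weights(models)
```

The averaging interval captures the "not too often, not too rarely" trade-off: averaging every step would collapse the population's diversity, while never averaging reduces to training independent models.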