This paper revisits Deep Mutual Learning (DML), a simple yet effective training paradigm. To improve vanilla DML, we propose replacing the KL divergence with the R\'{e}nyi divergence, which is more flexible and tunable. This modification consistently improves performance over vanilla DML with limited additional complexity. We analyze the convergence properties of the proposed paradigm theoretically and show that Stochastic Gradient Descent with a constant learning rate converges with an $\mathcal{O}(1)$ bias in the worst case for nonconvex optimization tasks. That is, learning reaches a nearby local optimum but continues searching within a bounded scope, which may help mitigate overfitting. Finally, our extensive empirical results demonstrate the advantage of combining DML and the R\'{e}nyi divergence, leading to models that generalize better.
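The tunability mentioned above comes from the order parameter $\alpha$ of the R\'{e}nyi divergence, $D_\alpha(P \| Q) = \frac{1}{\alpha-1}\log \sum_i p_i^\alpha q_i^{1-\alpha}$, which recovers the KL divergence as $\alpha \to 1$. A minimal NumPy sketch of this quantity (an illustration of the standard definition, not the paper's exact implementation; the function name and smoothing constant `eps` are our own choices):

```python
import numpy as np

def renyi_divergence(p, q, alpha=0.5, eps=1e-12):
    """Renyi divergence D_alpha(p || q) between two discrete distributions.

    D_alpha = log(sum_i p_i^alpha * q_i^(1-alpha)) / (alpha - 1),
    which tends to the KL divergence as alpha -> 1.
    `eps` (our choice) guards against zero probabilities.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()  # renormalize after smoothing
    q /= q.sum()
    if abs(alpha - 1.0) < 1e-8:
        # KL limit: sum_i p_i * log(p_i / q_i)
        return float(np.sum(p * np.log(p / q)))
    return float(np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0))
```

In a DML-style setup, `p` and `q` would be the softmax outputs of two peer networks on the same input, and $\alpha$ becomes a hyperparameter controlling how strongly the mimicry loss penalizes disagreement.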