The generalization performance of deep neural networks with respect to the optimization algorithm is one of the major concerns in machine learning, and it can be affected by various factors. In this paper, we theoretically prove that the Lipschitz constant of the loss function is an important factor in reducing the generalization error of the output model obtained by Adam or AdamW. The result can serve as a guideline for choosing the loss function when the optimization algorithm is Adam or AdamW. In addition, to evaluate the theoretical bound in a practical setting, we consider the human age estimation problem in computer vision. To assess generalization more rigorously, the training and test datasets are drawn from different distributions. Our experimental evaluation shows that a loss function with a lower Lipschitz constant and a lower maximum value improves the generalization of a model trained by Adam or AdamW.
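Since the guideline above rests on comparing candidate losses by their Lipschitz constant and maximum value, the following minimal numerical sketch illustrates such a comparison. It assumes the per-sample loss is a function of the prediction error restricted to a bounded range (here ±100 years, an illustrative choice for age estimation, not a value taken from the paper), and it is not the paper's own evaluation code.

```python
import numpy as np

# Candidate per-sample regression losses as functions of the error e = y_pred - y_true.
# The error range below (|e| <= 100 years) is an illustrative assumption for age estimation.
losses = {
    "squared (L2)": lambda e: 0.5 * e**2,
    "absolute (L1)": lambda e: np.abs(e),
    "Huber (delta=1)": lambda e: np.where(np.abs(e) <= 1.0, 0.5 * e**2, np.abs(e) - 0.5),
}

errors = np.linspace(-100.0, 100.0, 200001)

for name, loss in losses.items():
    values = loss(errors)
    # Empirical Lipschitz constant over the bounded error range:
    # the largest slope between neighbouring grid points.
    lipschitz = np.max(np.abs(np.diff(values) / np.diff(errors)))
    print(f"{name:>15}: max value ~ {values.max():10.1f}, Lipschitz constant ~ {lipschitz:7.2f}")
```

Under this sketch, the squared loss has both a large maximum value and a large Lipschitz constant over the error range, while the absolute and Huber losses keep the Lipschitz constant near 1; by the guideline in the abstract, the latter would be preferred when training with Adam or AdamW.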