In this paper, we provide a unified analysis of the excess risk of models trained by a proper algorithm with both smooth convex and smooth non-convex loss functions. In contrast to existing bounds in the literature that depend on the number of iterations, our bounds on the excess risk do not diverge as the number of iterations grows. This shows that, at least for smooth loss functions, the excess risk can be guaranteed after training. To derive these bounds, we develop a technique based on algorithmic stability and a non-asymptotic characterization of the empirical risk landscape, and with this technique we prove that the model obtained by a proper algorithm generalizes. For non-convex losses in particular, the conclusion follows from this technique combined with a stability analysis of a constructed auxiliary algorithm. Combining these results with properties of the empirical risk landscape and classical optimization results, we derive convergent upper bounds on the excess risk in both the convex and non-convex regimes.
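For reference, here is a minimal statement of the quantity being bounded, in standard notation that we introduce for illustration (the loss $\ell$, data distribution $\mathcal{D}$, sample $S = \{z_1, \dots, z_n\}$, and algorithm output $A(S)$ are our labels, not necessarily the paper's). The excess risk of a model $A(S)$ is its population risk minus the best achievable population risk:
$$
R(w) = \mathbb{E}_{z \sim \mathcal{D}}\big[\ell(w; z)\big], \qquad
\hat{R}_S(w) = \frac{1}{n}\sum_{i=1}^{n} \ell(w; z_i), \qquad
\mathcal{E}\big(A(S)\big) = R\big(A(S)\big) - \min_{w} R(w).
$$
One standard (exact, telescoping) decomposition of this quantity, with $w^* \in \arg\min_w R(w)$, is
$$
\mathcal{E}\big(A(S)\big) = \underbrace{R\big(A(S)\big) - \hat{R}_S\big(A(S)\big)}_{\text{generalization error}} + \underbrace{\hat{R}_S\big(A(S)\big) - \hat{R}_S(w^*)}_{\text{optimization error}} + \underbrace{\hat{R}_S(w^*) - R(w^*)}_{\text{expectation zero}},
$$
where stability-based arguments of the kind invoked above control the first term, and the analysis of the empirical risk landscape together with classical optimization results controls the second.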