Generalization is one of the fundamental issues in machine learning. However, traditional techniques like uniform convergence may be unable to explain generalization under overparameterization. As alternative approaches, techniques based on \emph{stability} analyze the training dynamics and derive algorithm-dependent generalization bounds. Unfortunately, stability-based bounds are still far from explaining the surprising generalization in deep learning, since neural networks usually suffer from unsatisfactory stability. This paper proposes a novel decomposition framework that improves stability-based bounds via a more fine-grained analysis of the signal and the noise, inspired by the observation that neural networks converge relatively slowly when fitting noise (which indicates better stability). Concretely, we decompose the excess risk dynamics and apply stability-based bounds only to the noise component. The decomposition framework performs well in both linear regimes (overparameterized linear regression) and non-linear regimes (diagonal matrix recovery). Experiments on neural networks verify the utility of the decomposition framework.