We propose a unified analytic and variational framework for studying stability in deep learning systems viewed as coupled representation-parameter dynamics. The central object is the Learning Stability Profile, which tracks the infinitesimal response of representations, parameters, and update mechanisms to perturbations along the learning trajectory. We prove a Fundamental Analytic Stability Theorem showing that uniform boundedness of this profile is equivalent, up to norm equivalence, to the existence of a Lyapunov-type energy that dissipates along the learning flow. In smooth regimes, the framework yields explicit stability exponents linking spectral norms, activation regularity, step sizes, and learning rates to the contractivity of the learning dynamics. Classical spectral stability results for feedforward networks, a discrete CFL-type condition for residual architectures, and parametric and temporal stability laws for stochastic gradient methods all arise as direct consequences. The theory extends to non-smooth learning systems, including ReLU networks, proximal and projected updates, and stochastic subgradient flows, by replacing classical derivatives with Clarke generalized derivatives and smooth energies with variational Lyapunov functionals. The resulting framework gives a unified dynamical description of stability across architectures and optimization methods, clarifying how architectural and algorithmic choices jointly govern robustness and sensitivity to perturbations. It also lays a foundation for extensions to continuous-time limits and geometric formulations of learning dynamics.
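To make the flavor of these stability exponents concrete, consider three standard model problems. The bounds below are classical Lipschitz and cocoercivity estimates, offered as a hedged illustration of the kinds of conditions the framework recovers, not as statements of its actual theorems. For a feedforward map $x_{l+1} = \sigma(W_l x_l)$ with an $L_\sigma$-Lipschitz activation, the composed network $\Phi$ satisfies
\[
  \|\Phi(x)-\Phi(y)\| \;\le\; \Bigl(\prod_{l=1}^{L} L_\sigma\,\|W_l\|_2\Bigr)\,\|x-y\|,
\]
so contractivity in the input holds whenever $\prod_l L_\sigma\|W_l\|_2 < 1$, a spectral-norm condition of the kind the abstract attributes to feedforward networks. For a residual step $x_{l+1} = x_l + h\,f(x_l)$ with $f$ $L$-Lipschitz and one-sided dissipative, $\langle f(x)-f(y),\,x-y\rangle \le -\mu\|x-y\|^2$, expanding the square gives
\[
  \|x_{l+1}-y_{l+1}\|^2 \;\le\; \bigl(1 - 2h\mu + h^2L^2\bigr)\,\|x_l-y_l\|^2,
\]
a contraction exactly under the discrete CFL-type step-size restriction $0 < h < 2\mu/L^2$. Finally, for a gradient step $\theta^{+} = \theta - \eta\,\nabla F(\theta)$ on a convex $L$-smooth objective, cocoercivity of $\nabla F$ makes the update map nonexpansive whenever $\eta \le 2/L$, a temporal stability law of the type stated above for stochastic gradient methods.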