Stochastic differential equations of Langevin-diffusion form have received significant attention, thanks to their foundational role in both Bayesian sampling algorithms and optimization in machine learning. In the latter, they serve as a conceptual model of the stochastic gradient flow in the training of over-parametrized models. However, the literature typically assumes smoothness of the potential, whose gradient is the drift term. In many problems the potential function is not continuously differentiable, and hence the drift is not Lipschitz continuous everywhere; this is exemplified by robust losses and Rectified Linear Units in regression problems. In this paper, we establish foundational results on the flow and asymptotic properties of Langevin-type Stochastic Differential Inclusions under assumptions appropriate to the machine-learning setting. In particular, we prove strong existence of the solution, as well as asymptotic minimization of the canonical free-energy functional.
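For reference, a minimal sketch of the standard objects behind these claims; the inverse temperature $\beta$, the use of the Clarke subdifferential $\partial U$, and the entropy-regularized form of the free energy follow common conventions and are assumptions of this sketch, not notation fixed by the abstract:
\[
\mathrm{d}X_t = -\nabla U(X_t)\,\mathrm{d}t + \sqrt{2\beta^{-1}}\,\mathrm{d}W_t
\qquad \text{(smooth potential } U \text{: Langevin diffusion)}
\]
\[
\mathrm{d}X_t \in -\partial U(X_t)\,\mathrm{d}t + \sqrt{2\beta^{-1}}\,\mathrm{d}W_t
\qquad \text{(non-smooth } U \text{: stochastic differential inclusion)}
\]
\[
\mathcal{F}(\rho) = \int U \,\mathrm{d}\rho \;+\; \beta^{-1}\!\int \rho \log \rho \,\mathrm{d}x
\qquad \text{(canonical free-energy functional)}
\]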