We consider a stochastic version of the proximal point algorithm for optimization problems posed on a Hilbert space. A typical application is supervised learning. While the method is not new, it has not been extensively analyzed in this form. Indeed, most related results are confined to the finite-dimensional setting, where error bounds may depend on the dimension of the space. The few existing results in the infinite-dimensional setting, on the other hand, prove only very weak types of convergence, owing to weak assumptions on the problem; in particular, there are no results that show convergence with a rate. In this article, we bridge these two worlds by assuming more regularity of the optimization problem, which allows us to prove convergence with an (optimal) sub-linear rate also in an infinite-dimensional setting. In particular, we assume that the objective function is the expected value of a family of convex, differentiable functions. While we require that the full objective function be strongly convex, we do not assume that its constituent parts are. Further, we require that the gradient satisfy a weak local Lipschitz continuity property, where the Lipschitz constant may grow polynomially, given certain guarantees on the variance and higher moments near the minimum. We illustrate these results by discretizing a concrete infinite-dimensional classification problem with varying degrees of accuracy.
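To fix notation, the setting described above can be sketched as follows; the symbols are illustrative (the abstract itself displays no formulas), using standard notation for the stochastic proximal point method rather than the paper's own.

\[
\min_{u \in H} F(u), \qquad F(u) = \mathbb{E}_{\xi}\bigl[f(u,\xi)\bigr],
\]
where $H$ is a Hilbert space, each $f(\cdot,\xi)$ is convex and differentiable, and the full objective $F$ is strongly convex. Given i.i.d. samples $\xi_1,\xi_2,\dots$ and step sizes $\eta_k > 0$, the stochastic proximal point iteration is
\[
u_{k+1} = \operatorname*{arg\,min}_{v \in H} \Bigl( f(v,\xi_k) + \tfrac{1}{2\eta_k}\,\|v - u_k\|_H^2 \Bigr)
\quad\Longleftrightarrow\quad
u_{k+1} = u_k - \eta_k \nabla_u f(u_{k+1},\xi_k),
\]
i.e. the implicit (backward) counterpart of the explicit stochastic gradient step $u_{k+1} = u_k - \eta_k \nabla_u f(u_k,\xi_k)$. With decreasing step sizes of the order $\eta_k \sim 1/k$, the sub-linear rate referred to above is typically of the form $\mathbb{E}\,\|u_k - u^\ast\|_H^2 \le C/k$, where $u^\ast$ denotes the unique minimizer; the precise constants and step-size conditions are those given in the paper.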