We provide new adaptive first-order methods for constrained convex optimization. Our main algorithms, AdaACSA and AdaAGD+, are accelerated methods that are universal in the sense that they achieve nearly-optimal convergence rates for both smooth and non-smooth functions, even when they only have access to stochastic gradients. In addition, they do not require any prior knowledge of how the objective function is parametrized, since they automatically adjust their per-coordinate learning rates. These can be seen as truly accelerated Adagrad methods for constrained optimization. We complement them with a simpler algorithm, AdaGrad+, which enjoys the same features but achieves the standard non-accelerated convergence rate. We also present a set of new results involving adaptive methods for unconstrained optimization and monotone operators.
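To make the notion of a per-coordinate learning rate concrete, the following is a minimal sketch of a generic AdaGrad-style step combined with projection onto a box constraint. It only illustrates the general idea of adapting step sizes coordinate-wise under constraints; it is not the AdaACSA, AdaAGD+, or AdaGrad+ algorithm from this work, and the names adaptive_projected_step, grad_sq_sum, lo, hi, and base_lr are hypothetical placeholders.

```python
# Illustrative sketch only: a generic AdaGrad-family per-coordinate step
# followed by Euclidean projection onto a box constraint [lo, hi]^d.
import numpy as np

def adaptive_projected_step(x, g, grad_sq_sum, lo, hi, base_lr=1.0, eps=1e-8):
    """One per-coordinate adaptive gradient step followed by projection.

    x            -- current iterate
    g            -- (stochastic) gradient at x
    grad_sq_sum  -- running per-coordinate sum of squared gradients
    lo, hi       -- bounds defining the box constraint
    """
    grad_sq_sum = grad_sq_sum + g ** 2            # accumulate squared gradients per coordinate
    lr = base_lr / (np.sqrt(grad_sq_sum) + eps)   # per-coordinate learning rate
    x_new = x - lr * g                            # adaptive gradient step
    x_new = np.clip(x_new, lo, hi)                # project back onto the feasible box
    return x_new, grad_sq_sum

# Usage sketch: minimize f(x) = ||A x - b||^2 over the box [0, 1]^5.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
x, s = np.full(5, 0.5), np.zeros(5)
for _ in range(500):
    g = 2 * A.T @ (A @ x - b)
    x, s = adaptive_projected_step(x, g, s, lo=0.0, hi=1.0)
```

The accelerated methods described in the abstract add momentum-style extrapolation on top of such adaptive steps; the sketch above only captures the non-accelerated, per-coordinate adaptive component.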