Beyond $\mathcal{O}(\sqrt{T})$ Regret for Constrained Online Optimization: Gradual Variations and Mirror Prox

We study constrained online convex optimization, where the constraints consist of a relatively simple constraint set (e.g., a Euclidean ball) and multiple functional constraints. Projections onto such decision sets are usually computationally challenging. Instead of enforcing all constraints in every slot, we therefore allow decisions to violate the functional constraints, and aim for low regret and low cumulative constraint violation over a horizon of $T$ time slots. The best known bound for this problem is $\mathcal{O}(\sqrt{T})$ regret and $\mathcal{O}(1)$ constraint violation, and the corresponding algorithms and analyses are restricted to Euclidean spaces. In this paper, we propose a new online primal-dual mirror prox algorithm whose regret is measured via the total gradient variation $V_*(T)$ over a sequence of $T$ loss functions. Specifically, we show that the proposed algorithm achieves $\mathcal{O}(\sqrt{V_*(T)})$ regret and $\mathcal{O}(1)$ constraint violation simultaneously. This bound holds in general non-Euclidean spaces, is never worse than the previously known $\big( \mathcal{O}(\sqrt{T}), \mathcal{O}(1) \big)$ result, and can be much better on regret when the variation is small. Furthermore, our algorithm is computationally efficient: each slot requires only two mirror descent steps rather than the solution of a general Lagrangian minimization problem. Along the way, our bounds also improve upon those of previous mirror-prox-type algorithms for this problem, which yield a worse $\mathcal{O}(T^{2/3})$ regret and $\mathcal{O}(T^{2/3})$ constraint violation.
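To make the "two mirror descent steps per slot" structure concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a Euclidean mirror map, so each mirror descent step reduces to a projected gradient step onto a ball, a single functional constraint $g(x) \le 0$, and a plain projected dual ascent update. The names `project_ball`, `lagrangian_grad`, `primal_dual_mirror_prox`, the step size `eta`, and the dual update rule are all illustrative assumptions.

```python
import math


def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius (the simple set)."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def lagrangian_grad(grad_f, grad_g, lam, x):
    """Gradient in x of f(x) + lam * g(x) for one functional constraint."""
    return [a + lam * b for a, b in zip(grad_f(x), grad_g(x))]


def primal_dual_mirror_prox(grad_losses, g, grad_g, dim, eta=0.1, radius=1.0):
    """Illustrative primal-dual loop with two projected steps per slot.

    grad_losses: list of gradient callables, one revealed per time slot.
    g, grad_g:   the functional constraint g(x) <= 0 and its gradient.
    Returns the played decisions and the final dual variable.
    """
    z = [0.0] * dim                      # prox center carried across slots
    lam = 0.0                            # dual variable, kept nonnegative
    prev_grad = lambda x: [0.0] * dim    # no loss observed before slot 1
    played = []
    for grad_f_t in grad_losses:
        # Step 1: pick the decision using the previous slot's loss gradient
        # as a prediction of the current one.
        d = lagrangian_grad(prev_grad, grad_g, lam, z)
        x = project_ball([zi - eta * di for zi, di in zip(z, d)], radius)
        played.append(x)
        # Step 2: the current loss is revealed; move the prox center using
        # its gradient evaluated at the played decision.
        d = lagrangian_grad(grad_f_t, grad_g, lam, x)
        z = project_ball([zi - eta * di for zi, di in zip(z, d)], radius)
        # Dual ascent on the observed violation (assumed rule, not the paper's).
        lam = max(0.0, lam + eta * g(x))
        prev_grad = grad_f_t
    return played, lam
```

When consecutive loss gradients are close, the prediction in step 1 is accurate and the two steps nearly coincide, which is the intuition behind regret bounds that scale with gradient variation rather than with $T$. Only the simple set is ever projected onto; the functional constraint enters through the dual variable.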
