We study a variant of online convex optimization where the player is permitted to switch decisions at most $S$ times in expectation throughout $T$ rounds. Similar problems have been addressed in prior work for the discrete decision set setting, and more recently in the continuous setting but only with an adaptive adversary. In this work, we aim to fill the gap and present computationally efficient algorithms in the more prevalent oblivious setting, establishing a regret bound of $O(T/S)$ for general convex losses and $\widetilde O(T/S^2)$ for strongly convex losses. In addition, for stochastic i.i.d.~losses, we present a simple algorithm that performs $\log T$ switches with only a multiplicative $\log T$ factor overhead in its regret in both the general and strongly convex settings. Finally, we complement our algorithms with lower bounds that match our upper bounds in some of the cases we consider.
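To make the stochastic i.i.d. claim concrete, here is a minimal sketch of one way to keep the number of switches logarithmic in $T$: update the decision only at rounds that are powers of two, playing the empirical-risk minimizer of the losses seen so far. This is an illustrative assumption, not necessarily the algorithm analyzed in the paper. The quadratic losses $f_t(x) = (x - z_t)^2$, the interval $[-1, 1]$ as decision set, and the name `lazy_follow_the_leader` are all hypothetical choices made for the example.

```python
import random


def lazy_follow_the_leader(samples, lo=-1.0, hi=1.0):
    """Play one decision per round against f_t(x) = (x - z_t)^2.

    The decision is recomputed only at rounds 2, 4, 8, ..., so it changes
    at most floor(log2(T)) times over T rounds. (Illustrative sketch only.)
    """
    x = 0.0            # initial decision, held until the first update round
    running_sum = 0.0  # sum of the samples z_1..z_{t-1} revealed so far
    switches = 0
    decisions = []
    next_update = 2    # next round at which a switch is allowed
    for t, z in enumerate(samples, start=1):
        if t == next_update:
            # Minimizer of the average quadratic loss over the first t-1
            # samples is their mean; clip it to the decision set [lo, hi].
            new_x = min(hi, max(lo, running_sum / (t - 1)))
            if new_x != x:
                switches += 1
            x = new_x
            next_update *= 2
        decisions.append(x)
        running_sum += z  # z_t is revealed only after x has been played
    return decisions, switches


if __name__ == "__main__":
    rng = random.Random(0)
    zs = [rng.gauss(0.3, 0.1) for _ in range(1024)]
    played, n_switches = lazy_follow_the_leader(zs)
    print(f"rounds: {len(played)}, switches: {n_switches}")
```

With $T = 1024$ the update rounds are $2, 4, \dots, 1024$, so the player switches at most $10 = \log_2 T$ times. The doubling schedule is a design choice for the sketch: each new decision is computed from roughly as many samples as it will then be played against.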