Stochastic and adversarial data are two widely studied settings in online learning. But many optimization tasks are neither i.i.d. nor fully adversarial, which makes it of fundamental interest to get a better theoretical understanding of the world between these extremes. In this work we establish novel regret bounds for online convex optimization in a setting that interpolates between stochastic i.i.d. and fully adversarial losses. By exploiting smoothness of the expected losses, these bounds replace a dependence on the maximum gradient length by the variance of the gradients, which was previously known only for linear losses. In addition, they weaken the i.i.d. assumption by allowing, for example, adversarially poisoned rounds, which were previously considered in the related expert and bandit settings. In the fully i.i.d. case, our regret bounds match the rates one would expect from results in stochastic acceleration, and we also recover the optimal stochastically accelerated rates via online-to-batch conversion. In the fully adversarial case our bounds gracefully deteriorate to match the minimax regret. We further provide lower bounds showing that our regret upper bounds are tight for all intermediate regimes in terms of the stochastic variance and the adversarial variation of the loss gradients.
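As a reminder of the standard online-to-batch conversion invoked above (stated here as a generic textbook fact, not as the paper's specific bound), averaging the online iterates turns a regret guarantee into an excess-risk guarantee for i.i.d. losses:

% Online-to-batch conversion (generic statement): let f(w) = E[f_t(w)] for
% i.i.d. convex losses f_1, ..., f_T, and let \bar{w}_T be the average iterate.
\[
  \mathbb{E}\bigl[f(\bar{w}_T)\bigr] - \min_{w \in \mathcal{W}} f(w)
  \;\le\; \frac{\mathbb{E}[R_T]}{T},
  \qquad
  \bar{w}_T = \frac{1}{T}\sum_{t=1}^{T} w_t,
  \quad
  R_T = \sum_{t=1}^{T} f_t(w_t) - \min_{w \in \mathcal{W}} \sum_{t=1}^{T} f_t(w),
\]
% which follows from Jensen's inequality applied to the convex function f
% together with the tower rule for the expectation over the i.i.d. losses.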