Stochastic and adversarial data are two widely studied settings in online learning. But many optimization tasks are neither i.i.d. nor fully adversarial, which makes it of fundamental interest to gain a better theoretical understanding of the regime between these extremes. In this work we establish novel regret bounds for online convex optimization in a setting that interpolates between stochastic i.i.d. and fully adversarial losses. By exploiting the smoothness of the expected losses, these bounds replace a dependence on the maximum gradient length with a dependence on the variance of the gradients, which was previously known only for linear losses. In addition, they weaken the i.i.d. assumption by allowing, for example, adversarially poisoned rounds, which were previously considered only in the expert and bandit settings; our results extend this to the online convex optimization framework. In the fully i.i.d. case, our bounds match the rates one would expect from results in stochastic acceleration, and in the fully adversarial case they gracefully deteriorate to match the minimax regret. We further provide lower bounds showing that our regret upper bounds are tight for all intermediate regimes in terms of the stochastic variance and the adversarial variation of the loss gradients.
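For intuition, write $\sigma^2_{1:T}$ for the cumulative stochastic variance and $\Sigma^2_{1:T}$ for the cumulative adversarial variation of the loss gradients; this notation is assumed here purely for illustration and is not fixed by the abstract. Up to logarithmic factors, an interpolating regret bound of the kind described would take a form such as

$$R_T = O\!\left(\sqrt{\sigma^2_{1:T}} + \sqrt{\Sigma^2_{1:T}}\right),$$

so that in the fully i.i.d. case the adversarial variation vanishes and the regret is governed by the stochastic variance alone, while in the fully adversarial case $\Sigma^2_{1:T}$ may grow linearly in $T$ and the bound deteriorates gracefully to the familiar $O(\sqrt{T})$ minimax rate.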