Online mirror descent (OMD) and dual averaging (DA) -- two fundamental algorithms for online convex optimization -- are known to have very similar (and sometimes identical) performance guarantees when used with a fixed learning rate. Under dynamic learning rates, however, OMD is provably inferior to DA and suffers linear regret, even in common settings such as prediction with expert advice. We modify the OMD algorithm through a simple technique that we call stabilization. We give essentially the same abstract regret bound for OMD with stabilization and for DA by modifying the classical OMD convergence analysis in a careful and modular way that allows for straightforward and flexible proofs. Simple corollaries of these bounds show that OMD with stabilization and DA enjoy the same performance guarantees in many applications -- even under dynamic learning rates. We also shed light on the similarities between OMD and DA and show simple conditions under which stabilized OMD and DA generate the same iterates.
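To make the setting concrete, below is a minimal sketch of OMD for prediction with expert advice (entropic mirror map on the probability simplex, i.e. exponentiated gradient) with an optional stabilization step. The specific stabilization rule shown -- mixing the current iterate back toward the initial point with weight eta_{t+1}/eta_t before the mirror step -- is one natural choice for decreasing learning rates and is an illustrative assumption, not necessarily the exact scheme analyzed in the paper.

```python
import numpy as np

def omd_expert_advice(losses, etas, stabilize=False):
    """Run exponentiated-gradient OMD on the simplex and return total loss.

    losses: (T, d) array of per-round expert losses in [0, 1].
    etas:   length-T array of (typically decreasing) learning rates.
    stabilize: if True, mix the iterate with the initial point x1
               before each mirror step (a hypothetical stabilization
               rule for illustration).
    """
    T, d = losses.shape
    x1 = np.full(d, 1.0 / d)               # initial (uniform) iterate
    x = x1.copy()
    total = 0.0
    for t in range(T):
        total += x @ losses[t]             # incur linear loss <x_t, l_t>
        eta = etas[t]
        eta_next = etas[t + 1] if t + 1 < T else etas[t]
        if stabilize:
            gamma = eta_next / eta         # shrinks toward x1 as eta decays
            x = gamma * x + (1 - gamma) * x1
        x = x * np.exp(-eta * losses[t])   # entropic mirror step
        x /= x.sum()                       # normalize back onto the simplex
    return total

rng = np.random.default_rng(0)
T, d = 1000, 10
losses = rng.random((T, d))
etas = np.sqrt(np.log(d) / np.arange(1, T + 1))   # dynamic rate ~ 1/sqrt(t)
print(omd_expert_advice(losses, etas, stabilize=True))
```

The unstabilized variant is recovered with `stabilize=False`; comparing the two under the same dynamic learning-rate schedule is the regime where the abstract's separation between plain OMD and DA arises.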