This paper presents competitive algorithms for a novel class of online optimization problems with memory. We consider a setting where the learner seeks to minimize the sum of a hitting cost and a switching cost that depends on the previous $p$ decisions. This setting generalizes Smoothed Online Convex Optimization. The proposed approach, Optimistic Regularized Online Balanced Descent, achieves a constant, dimension-free competitive ratio. Further, we show a connection between online optimization with memory and online control with adversarial disturbances. This connection, in turn, leads to a new constant-competitive policy for a rich class of online control problems.
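As a sketch of the setting described above, using notation that is assumed here rather than taken from the abstract: let $y_t$ denote the decision in round $t$, $f_t$ the hitting cost revealed in that round, and $c$ a switching cost over the current decision and the previous $p$ decisions. Under these assumptions, the total cost of an online algorithm $ALG$ over $T$ rounds could be written as

$$
\mathrm{cost}(ALG) \;=\; \sum_{t=1}^{T} \Big( f_t(y_t) + c(y_t, y_{t-1}, \dots, y_{t-p}) \Big),
$$

and $ALG$ is said to be $C$-competitive if

$$
\mathrm{cost}(ALG) \;\le\; C \cdot \mathrm{cost}(OPT) \quad \text{for every problem instance},
$$

where $OPT$ denotes the offline optimal sequence of decisions. With $p = 1$ and a switching cost that penalizes the distance between $y_t$ and $y_{t-1}$ (for example a norm or a squared norm), this takes the form of the Smoothed Online Convex Optimization objective. "Dimension-free" then means that the constant $C$ does not grow with the dimension of $y_t$.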