We introduce a simple but general online learning framework in which a learner plays against an adversary in a vector-valued game that changes every round. Even though the learner's objective is not convex-concave (and so the minimax theorem does not apply), we give a simple algorithm that can compete with the setting in which the adversary must announce their action first, with optimally diminishing regret. We demonstrate the power of our framework by using it to (re)derive optimal bounds and efficient algorithms across a variety of domains, ranging from multicalibration, to a large set of no-regret algorithms, to a variant of Blackwell's approachability theorem for polytopes with fast convergence rates. As a new application, we show how to ``(multi)calibeat'' an arbitrary collection of forecasters -- achieving an exponentially improved dependence on the number of models we are competing against, compared to prior work.