In repeated games, strategies are often evaluated by their ability to guarantee the performance of the single best action selected in hindsight (a property referred to as \emph{Hannan consistency}, or \emph{no-regret}). However, the single best action is a limited yardstick for evaluating strategies, as any static action may perform poorly in common dynamic settings. We propose the notion of \emph{dynamic benchmark consistency}, which requires a strategy to asymptotically guarantee the performance of the best \emph{dynamic} sequence of actions selected in hindsight, subject to a constraint on the number of action changes the corresponding dynamic benchmark admits. We show that dynamic benchmark consistent strategies exist if and only if the number of changes in the benchmark scales sublinearly with the horizon length. Further, our main result establishes that the set of empirical joint distributions of play that may emerge when all players deploy such strategies asymptotically coincides with the set of \emph{Hannan equilibria} (also referred to as \emph{coarse correlated equilibria}) of the stage game. This general characterization allows one to leverage analyses developed for frameworks that consider static benchmarks, which we demonstrate by bounding the social efficiency of the possible outcomes in our~setting. Together, our results imply that dynamic benchmark consistent strategies introduce the following \emph{Pareto-type} improvement over no-regret strategies: they enable stronger individual guarantees against arbitrary strategies of the other players, while maintaining the same worst-case guarantees on the social welfare when all players adopt these strategies.
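To make the benchmark concrete, the following is a minimal sketch of how the best dynamic sequence of actions in hindsight could be computed for one player, under assumptions that are ours rather than the paper's. The function names `best_dynamic_benchmark` and `dynamic_regret` are hypothetical, and `payoffs[t][a]` is assumed to hold the payoff the player would have obtained in period `t` by playing action `a` against the other players' realized actions. Setting `max_changes = 0` recovers the static (no-regret) benchmark.

```python
from typing import Sequence


def best_dynamic_benchmark(payoffs: Sequence[Sequence[float]], max_changes: int) -> float:
    """Highest cumulative payoff of any action sequence with at most
    `max_changes` action changes (illustrative dynamic program)."""
    if not payoffs:
        return 0.0
    n_actions = len(payoffs[0])
    neg_inf = float("-inf")
    # best[c][a]: best cumulative payoff so far over sequences that
    # currently play action a and have used exactly c changes.
    best = [[neg_inf] * n_actions for _ in range(max_changes + 1)]
    best[0] = list(payoffs[0])
    for row in payoffs[1:]:
        new = [[neg_inf] * n_actions for _ in range(max_changes + 1)]
        for c in range(max_changes + 1):
            for a in range(n_actions):
                stay = best[c][a]  # keep playing a, no new change
                switch = neg_inf   # arrive at a from another action
                if c > 0:
                    switch = max(
                        (best[c - 1][b] for b in range(n_actions) if b != a),
                        default=neg_inf,
                    )
                prev = max(stay, switch)
                if prev > neg_inf:
                    new[c][a] = prev + row[a]
        best = new
    return max(max(r) for r in best)


def dynamic_regret(payoffs: Sequence[Sequence[float]],
                   played: Sequence[int],
                   max_changes: int) -> float:
    """Benchmark payoff minus the payoff of the actions actually played."""
    realized = sum(row[a] for row, a in zip(payoffs, played))
    return best_dynamic_benchmark(payoffs, max_changes) - realized


# Toy example: action 0 is good early, action 1 is good late.
toy_payoffs = [[1, 0], [1, 0], [0, 1], [0, 1]]
static_value = best_dynamic_benchmark(toy_payoffs, 0)   # best single action
dynamic_value = best_dynamic_benchmark(toy_payoffs, 1)  # one change allowed
```

In this toy instance any single action earns 2, while a benchmark allowed one change earns 4, so a strategy that always plays action 0 has zero regret against the static benchmark but positive regret against the dynamic one. In the sense described above, consistency would ask that this regret, divided by the horizon length, vanish as the horizon grows while the allowed number of changes grows sublinearly.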