We present a novel control-theoretic understanding of online optimization and learning in games, via the notion of passivity. Passivity is a fundamental concept in control theory, which abstracts energy conservation and dissipation in physical systems. It has become a standard tool in analysis of general feedback systems, to which game dynamics belong. Our starting point is to show that all continuous-time Follow-the-Regularized-Leader (FTRL) dynamics, which includes the well-known Replicator Dynamic, are lossless, i.e. it is passive with no energy dissipation. Interestingly, we prove that passivity implies bounded regret, connecting two fundamental primitives of control theory and online optimization. The observation of energy conservation in FTRL inspires us to present a family of lossless learning dynamics, each of which has an underlying energy function with a simple gradient structure. This family is closed under convex combination; as an immediate corollary, any convex combination of FTRL dynamics is lossless and thus has bounded regret. This allows us to extend the framework of Fox and Shamma (Games, 2013) to prove not just global asymptotic stability results for game dynamics, but Poincar\'e recurrence results as well. Intuitively, when a lossless game (e.g. graphical constant-sum game) is coupled with lossless learning dynamic, their interconnection is also lossless, which results in a pendulum-like energy-preserving recurrent behavior, generalizing the results of Piliouras and Shamma (SODA, 2014) and Mertikopoulos, Papadimitriou and Piliouras (SODA, 2018).
翻译:我们通过被动概念展示了对在线优化和游戏中学习的新型控制理论理解。被动是控制理论中的一个基本概念,它总结了物理系统中的节能和消散。它已成为分析一般反馈系统的标准工具,游戏动力属于这种系统。我们的出发点是显示所有连续时间的“追踪-再分类-引导”动态(FTRL),其中包括众所周知的“复制者”动态,是无损的,也就是说,它没有节能。有意思的是,我们证明“被动”意味着受约束的遗憾,连接了控制理论和在线优化的两个基本原始源。FTRL对能源节能的观察激励我们展示了无损学习动态的组合,每个系统都有简单的梯度结构的基本能量功能。这个组合在 convex的组合下被封闭;作为直接的必然结果,FTRL动态的任何螺旋组合都是无损的,因此也令人感到遗憾。这使我们能够扩展Fox和Shamma(Gamels,2013年)的反复互连锁的游戏结果而不是游戏性学习结果。