Learning in zero-sum games studies a situation where multiple agents competitively learn their strategy. In such multi-agent learning, we often see that the strategies cycle around their optimum, i.e., Nash equilibrium. When a game periodically varies (called a ``periodic'' game), however, the Nash equilibrium moves generically. How learning dynamics behave in such periodic games is of interest but still unclear. Interestingly, we discover that the behavior is highly dependent on the relationship between the two speeds at which the game changes and at which players learn. We observe that when these two speeds synchronize, the learning dynamics diverge, and their time-average does not converge. Otherwise, the learning dynamics draw complicated cycles, but their time-average converges. Under some assumptions introduced for the dynamical systems analysis, we prove that this behavior occurs. Furthermore, our experiments observe this behavior even if removing these assumptions. This study discovers a novel phenomenon, i.e., synchronization, and gains insight widely applicable to learning in periodic games.
翻译:暂无翻译