We improve the framework of open games with agency by showing how the players' counterfactual analysis, which gives rise to Nash equilibria, can be described in the dynamics of the game itself (hence diegetically), dispensing with devices such as equilibrium predicates. This new approach overlaps almost completely with the way gradient-based learners are specified and trained. Indeed, we show that feedback propagation in games can be seen as reverse-mode differentiation, with one crucial difference that explains the distinctive phenomenology of non-cooperative games. We outline a functorial construction of the arena of games, show that players form a subsystem over it, and prove that their `fixpoint behaviours' are Nash equilibria.