This work explores the training dynamics of three-player games: the conditions under which they converge and the equilibria they converge on. In contrast to prior work, we examine a three-player game architecture in which all players explicitly interact with each other. Prior work analyzes games in which two of the three agents interact with only one other player, so that the game reduces to a pair of two-player games. We study three-player training dynamics using an extension of the simplified bilinear smooth game, which we call the simplified trilinear smooth game. We find that in most cases trilinear games do not converge on the Nash equilibrium, instead converging on a fixed point that is optimal for two players but not for the third. Further, we explore how the order of updates influences convergence. In addition to alternating and simultaneous updates, we explore a new update order, maximizer-first, which is only possible in a three-player game. We find that three-player games can converge on a Nash equilibrium using maximizer-first updates. Finally, we experiment with a different momentum value for each player in a trilinear smooth game under all three update orders, and show that maximizer-first updates come closer to optimal results across a larger set of player-specific momentum triads than the other update orders.
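The three update orders can be illustrated with a minimal sketch. The abstract does not give the game's payoff or the exact maximizer-first schedule, so everything below is an assumption made for illustration: the payoff is taken to be f(x, y, z) = x * y * z, players x and y are taken to minimize f while z maximizes it, and "maximizer-first" is read as "z updates first, then x and y update simultaneously against the new z". The function names `grads`, `step`, and `run` are hypothetical, and momentum is omitted for brevity.

```python
def grads(x, y, z):
    """Partial derivatives of the assumed payoff f(x, y, z) = x * y * z."""
    return y * z, x * z, x * y


def step(x, y, z, lr=0.1, order="simultaneous"):
    """One plain gradient step under the given update order.

    Assumption: x and y descend on f, z ascends on f.
    """
    if order == "simultaneous":
        # All three players use gradients evaluated at the old point.
        gx, gy, gz = grads(x, y, z)
        return x - lr * gx, y - lr * gy, z + lr * gz
    if order == "alternating":
        # Each player sees the most recent values of the players before it.
        x = x - lr * grads(x, y, z)[0]
        y = y - lr * grads(x, y, z)[1]
        z = z + lr * grads(x, y, z)[2]
        return x, y, z
    if order == "maximizer_first":
        # Assumed reading: the maximizer moves first, then both
        # minimizers move simultaneously against the updated z.
        z = z + lr * grads(x, y, z)[2]
        gx, gy, _ = grads(x, y, z)
        return x - lr * gx, y - lr * gy, z
    raise ValueError(f"unknown update order: {order}")


def run(order, steps=100, start=(1.0, 1.0, 1.0), lr=0.1):
    """Iterate `step` from `start` and return the final point."""
    x, y, z = start
    for _ in range(steps):
        x, y, z = step(x, y, z, lr=lr, order=order)
    return x, y, z


if __name__ == "__main__":
    for order in ("simultaneous", "alternating", "maximizer_first"):
        print(order, run(order))
```

The three schedules differ only in which players see updated values within a step. This sketch makes no claim about which of them converges, and it does not reproduce the reported results.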