Stability of Gradient Learning Dynamics in Continuous Games: Vector Action Spaces

Towards characterizing the optimization landscape of games, this paper analyzes the stability of gradient-based dynamics near fixed points of two-player continuous games. We introduce the quadratic numerical range as a method to characterize the spectrum of game dynamics and prove the robustness of equilibria to variations in learning rates. By decomposing the game Jacobian into symmetric and skew-symmetric components, we assess the contribution of a vector field's potential and rotational components to the stability of differential Nash equilibria. Our results show that in zero-sum games, all Nash equilibria are stable and robust; in potential games, all stable points are Nash equilibria. For general-sum games, we provide a sufficient condition for instability. We conclude with a numerical example in which learning with timescale separation results in faster convergence.
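The symmetric/skew-symmetric decomposition mentioned above can be illustrated with a minimal sketch. It assumes a quadratic zero-sum game f(x, y) = ½ xᵀAx + xᵀBy − ½ yᵀDy in which player 1 minimizes f over x and player 2 minimizes −f over y, so the game Jacobian is [[A, B], [−Bᵀ, D]]. The matrices A, B, D, the function names, and the learning-rate ratio tau are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def game_jacobian(A, B, D):
    """Jacobian of the stacked gradients (D1 f, D2 (-f)) for the assumed quadratic game."""
    return np.block([[A, B], [-B.T, D]])

def split(J):
    """Return the symmetric (potential-like) and skew-symmetric (rotational) parts of J."""
    S = 0.5 * (J + J.T)
    K = 0.5 * (J - J.T)
    return S, K

def is_stable(J, n1, tau=1.0):
    """Check whether z' = -Lambda J z is asymptotically stable at the origin.

    Lambda = diag(I, tau * I) scales player 2's learning rate by tau.
    The linear system is stable when every eigenvalue of Lambda J
    has a strictly positive real part.
    """
    scale = np.ones(J.shape[0])
    scale[n1:] = tau                       # rows belonging to player 2
    eigvals = np.linalg.eigvals(scale[:, None] * J)
    return bool(np.all(eigvals.real > 0))

# Hypothetical example data: A and D positive definite, so the origin
# is a differential Nash equilibrium of the assumed game.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, 3.0], [-2.0, 0.5]])
D = np.array([[1.0, 0.0], [0.0, 3.0]])

J = game_jacobian(A, B, D)
S, K = split(J)

# Stability is checked for several learning-rate ratios.
results = {tau: is_stable(J, n1=2, tau=tau) for tau in (0.1, 1.0, 10.0)}
print(results)
```

In this zero-sum sketch the symmetric part is block diagonal with blocks A and D, and the interaction term B appears only in the skew-symmetric part, which is why changing tau does not destabilize the equilibrium here.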

