Independent Natural Policy Gradient Methods for Potential Games: Finite-time Global Convergence with Entropy Regularization - Zhuanzhi (专知) Papers


Agent · Independent · Regularization term · Potential function · Functional

August 29, 2022


Shicong Cen, Fan Chen, Yuejie Chi

A major challenge in multi-agent systems is that the system complexity grows dramatically with the number of agents as well as the size of their action spaces, which is typical in real-world scenarios such as autonomous vehicles, robotic teams, and network routing. There is hence a pressing need to design decentralized or independent algorithms in which each agent's update is based only on its local observations, without introducing complex communication or coordination mechanisms. In this work, we study the finite-time convergence of independent entropy-regularized natural policy gradient (NPG) methods for potential games, where the difference in an agent's utility function due to unilateral deviation matches exactly that of a common potential function. The proposed entropy-regularized NPG method enables each agent to deploy symmetric, decentralized, and multiplicative updates according to its own payoff. We show that the proposed method converges to the quantal response equilibrium (QRE), the equilibrium of the entropy-regularized game, at a sublinear rate that is independent of the size of the action space and grows at most sublinearly with the number of agents. Appealingly, the convergence rate further becomes independent of the number of agents for the important special case of identical-interest games, leading to the first method that converges at a dimension-free rate. Our approach can be used as a smoothing technique to find an approximate Nash equilibrium (NE) of the unregularized problem without assuming that stationary policies are isolated.
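To make the "symmetric, decentralized, and multiplicative" update concrete, here is a minimal sketch for a two-player identical-interest game, which is a special case of a potential game. The abstract does not give the update rule, so the form used below, pi_i(a) proportional to pi_i(a)^(1 - eta*tau) * exp(eta * r_i(a)), is an assumption: it is a commonly used entropy-regularized multiplicative-weights step, not necessarily the exact rule in the paper. The payoff matrix `U`, the step size `eta`, the regularization weight `tau`, and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_normalize(logits):
    """Return log-probabilities for the given logits (stable log-softmax)."""
    m = logits.max()
    return logits - (m + np.log(np.exp(logits - m).sum()))

def softmax(logits):
    return np.exp(log_normalize(logits))

def independent_npg(U, tau=0.1, eta=0.1, steps=5000):
    """Assumed entropy-regularized multiplicative update, run independently
    by two agents that both receive the payoff U[a1, a2]."""
    n1, n2 = U.shape
    logp1 = log_normalize(np.zeros(n1))  # uniform initial policies
    logp2 = log_normalize(np.zeros(n2))
    for _ in range(steps):
        pi1, pi2 = np.exp(logp1), np.exp(logp2)
        # Each agent only needs its own expected payoff per action.
        r1 = U @ pi2
        r2 = U.T @ pi1
        # Simultaneous update in log space:
        # log pi <- (1 - eta*tau) * log pi + eta * r, then renormalize.
        logp1 = log_normalize((1.0 - eta * tau) * logp1 + eta * r1)
        logp2 = log_normalize((1.0 - eta * tau) * logp2 + eta * r2)
    return np.exp(logp1), np.exp(logp2)

def qre_gap(U, pi1, pi2, tau):
    """Largest deviation from the QRE fixed point pi_i = softmax(r_i / tau)."""
    gap1 = np.abs(pi1 - softmax(U @ pi2 / tau)).max()
    gap2 = np.abs(pi2 - softmax(U.T @ pi1 / tau)).max()
    return max(gap1, gap2)

# Hypothetical 2x2 coordination game with a shared payoff.
U = np.array([[1.0, 0.0],
              [0.0, 0.5]])
pi1, pi2 = independent_npg(U)
print(pi1, pi2, qre_gap(U, pi1, pi2, tau=0.1))
```

A fixed point of this assumed update satisfies pi_i proportional to exp(r_i / tau), which is the QRE condition the gap function checks. Letting tau shrink moves that fixed point toward a Nash equilibrium of the unregularized game, which is the smoothing idea mentioned in the abstract.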

