前瞻性最佳反应多倍倍增效力最新方法 (Forward Looking Best-Response Multiplicative Weights Update Methods) - 专知论文

会员服务 ·

0

Weight · 纳什均衡 · 前向 · Performer · 收缩 ·

2021 年 6 月 7 日

Forward Looking Best-Response Multiplicative Weights Update Methods

翻译：前瞻性最佳反应多倍倍增效力最新方法

Michail Fasoulakis,Evangelos Markakis,Yannis Pantazis,Constantinos Varsos

We propose a novel variant of the \emph{multiplicative weights update method} with forward-looking best-response strategies, that guarantees last-iterate convergence for \emph{zero-sum games} with a unique \emph{Nash equilibrium}. Particularly, we show that the proposed algorithm converges to an $\eta^{1/\rho}$-approximate Nash equilibrium, with $\rho > 1$, by decreasing the Kullback-Leibler divergence of each iterate by a rate of at least $\Omega(\eta^{1+\frac{1}{\rho}})$, for sufficiently small learning rate $\eta$. When our method enters a sufficiently small neighborhood of the solution, it becomes a contraction and converges to the Nash equilibrium of the game. Furthermore, we perform an experimental comparison with the recently proposed optimistic variant of the multiplicative weights update method, by \cite{Daskalakis2019LastIterateCZ}, which has also been proved to attain last-iterate convergence. Our findings reveal that our algorithm offers substantial gains both in terms of the convergence rate and the region of contraction relative to the previous approach.

翻译：我们提出了一个具有前瞻性最佳应对战略的新变式,即 \ emph{ 倍增加权更新法, 保证 \ emph{ 零和游戏} 与 { 纳什平衡} 的独特。特别是, 我们显示, 提议的算法与 $ { 1/\\ rh} 相近的纳什平衡相融合, 以 $ > 1 美元为单位, 通过降低 Kullback- Leiber 差异, 以至少 $ / Omega (\ ⁇ 1 { { { { { { { { { { 1 { { { { ⁇ } { { { { ⁇ { { ⁇ { ⁇ } 。 } 保证足够小的学习率, $ 。当我们的方法进入一个足够小的解决方案附近时,, 它就会变成收缩, 与游戏的纳什平衡一致。此外, 我们用最近提出的多复制权重的比较变式方法进行了实验性比较比较比较比较比较,, 通过\, 通过\ 。

0

相关内容

Weight

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

人工智能顶会WSDM2021优秀论文奖(Best Paper Award Runner-Up)出炉

人工智能顶会WSDM2021优秀论文奖(Best Paper Award Runner-Up)出炉

专知会员服务

19+阅读 · 2021年2月19日

最新《非光滑优化》十讲硬核课程，剑桥大学梁经纬博士主讲

最新《非光滑优化》十讲硬核课程，剑桥大学梁经纬博士主讲

专知会员服务

34+阅读 · 2020年8月14日

策略梯度方法的算子视图，An operator view of policy gradient methods

策略梯度方法的算子视图，An operator view of policy gradient methods

专知会员服务

11+阅读 · 2020年6月23日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知会员服务

124+阅读 · 2020年5月30日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

85+阅读 · 2020年2月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Algorithms for Energy Conservation in Heterogeneous Data Centers

Arxiv

0+阅读 · 2021年7月30日

Contemporary Symbolic Regression Methods and their Relative Performance

Arxiv

0+阅读 · 2021年7月29日

A new non-linear instability for scalar fields

A new non-linear instability for scalar fields

Arxiv

0+阅读 · 2021年7月29日

Consensus-Based Optimization on the Sphere: Convergence to Global Minimizers and Machine Learning

Arxiv

0+阅读 · 2021年7月28日

Super-localization of elliptic multiscale problems

Arxiv

0+阅读 · 2021年7月28日

Policy Gradient Methods Find the Nash Equilibrium in N-player General-sum Linear-quadratic Games

Arxiv

0+阅读 · 2021年7月27日

RingBFT: Resilient Consensus over Sharded Ring Topology

Arxiv

0+阅读 · 2021年7月27日

Meta-Learning with Implicit Gradients

Meta-Learning with Implicit Gradients

Arxiv

13+阅读 · 2019年9月10日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Neural Response Generation with Dynamic Vocabularies

Arxiv

5+阅读 · 2017年11月30日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

人工智能顶会WSDM2021优秀论文奖(Best Paper Award Runner-Up)出炉

人工智能顶会WSDM2021优秀论文奖(Best Paper Award Runner-Up)出炉

专知会员服务

19+阅读 · 2021年2月19日

最新《非光滑优化》十讲硬核课程，剑桥大学梁经纬博士主讲

最新《非光滑优化》十讲硬核课程，剑桥大学梁经纬博士主讲

专知会员服务

34+阅读 · 2020年8月14日

策略梯度方法的算子视图，An operator view of policy gradient methods

策略梯度方法的算子视图，An operator view of policy gradient methods

专知会员服务

11+阅读 · 2020年6月23日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知会员服务

124+阅读 · 2020年5月30日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

85+阅读 · 2020年2月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

隐身自主无人水下航行器技术如何变革水下作战并重塑海军竞争

《俄乌战争中的无人系统：新的战争方式与新兴趋势——来自前线的印象》报告

《海上自主水面船舶远程操作中心：安全可持续运行的多维度分析》

相关资讯

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Algorithms for Energy Conservation in Heterogeneous Data Centers

Arxiv

0+阅读 · 2021年7月30日

Contemporary Symbolic Regression Methods and their Relative Performance

Arxiv

0+阅读 · 2021年7月29日

A new non-linear instability for scalar fields

A new non-linear instability for scalar fields

Arxiv

0+阅读 · 2021年7月29日

Consensus-Based Optimization on the Sphere: Convergence to Global Minimizers and Machine Learning

Arxiv

0+阅读 · 2021年7月28日

Super-localization of elliptic multiscale problems

Arxiv

0+阅读 · 2021年7月28日

Policy Gradient Methods Find the Nash Equilibrium in N-player General-sum Linear-quadratic Games

Arxiv

0+阅读 · 2021年7月27日

RingBFT: Resilient Consensus over Sharded Ring Topology

Arxiv

0+阅读 · 2021年7月27日

Meta-Learning with Implicit Gradients

Meta-Learning with Implicit Gradients

Arxiv

13+阅读 · 2019年9月10日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Neural Response Generation with Dynamic Vocabularies

Arxiv

5+阅读 · 2017年11月30日

微信扫码咨询专知VIP会员