Regret Minimization with Dynamic Benchmarks in Repeated Games - 专知 (Zhuanzhi) Papers


December 6, 2022

Regret Minimization with Dynamic Benchmarks in Repeated Games


Ludovico Crippa, Yonatan Gur, Bar Light

In repeated games, strategies are often evaluated by their ability to guarantee the performance of the single best action that is selected in hindsight (a property referred to as Hannan consistency, or no-regret). However, the effectiveness of the single best action as a yardstick to evaluate strategies is limited, as any static action may perform poorly in common dynamic settings. We propose the notion of dynamic benchmark consistency, which requires a strategy to asymptotically guarantee the performance of the best dynamic sequence of actions selected in hindsight, subject to a constraint on the number of action changes the corresponding dynamic benchmark admits. We show that dynamic benchmark consistent strategies exist if and only if the number of changes in the benchmark scales sublinearly with the horizon length. Further, our main result establishes that the set of empirical joint distributions of play that may emerge, when all players deploy such strategies, asymptotically coincides with the set of Hannan equilibria (also referred to as coarse correlated equilibria) of the stage game. This general characterization allows one to leverage analyses developed for frameworks that consider static benchmarks, which we demonstrate by bounding the social efficiency of the possible outcomes in our setting. Together, our results imply that dynamic benchmark consistent strategies introduce the following Pareto-type improvement over no-regret strategies: they enable stronger individual guarantees against arbitrary strategies of the other players, while maintaining the same worst-case guarantees on the social welfare when all players adopt these strategies.
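The difference between the static and the dynamic benchmark can be illustrated with a small Python sketch. It is not taken from the paper: the function names, the dynamic-programming formulation, and the toy payoff table are assumptions made here for illustration only. Given the payoffs each action would have earned in every period, it computes the payoff of the single best action in hindsight and the payoff of the best action sequence that changes action at most `max_changes` times.

```python
from typing import List


def static_benchmark(payoffs: List[List[float]]) -> float:
    """Total payoff of the single best fixed action in hindsight.

    payoffs[t][a] is the payoff action a would have earned in period t.
    """
    num_actions = len(payoffs[0])
    return max(sum(row[a] for row in payoffs) for a in range(num_actions))


def dynamic_benchmark(payoffs: List[List[float]], max_changes: int) -> float:
    """Total payoff of the best action sequence with at most max_changes switches."""
    num_actions = len(payoffs[0])
    # best[c][a]: best total payoff so far using at most c changes, ending in action a.
    best = [[payoffs[0][a] for a in range(num_actions)]
            for _ in range(max_changes + 1)]
    for row in payoffs[1:]:
        prev = best
        best = []
        for c in range(max_changes + 1):
            # Best value reachable by spending one change in this period.
            switch_from = max(prev[c - 1]) if c > 0 else float("-inf")
            best.append([row[a] + max(prev[c][a], switch_from)
                         for a in range(num_actions)])
    return max(best[max_changes])


def regret(payoffs: List[List[float]], played: List[int], benchmark: float) -> float:
    """Benchmark payoff minus the payoff actually realized by the played actions."""
    realized = sum(payoffs[t][a] for t, a in enumerate(played))
    return benchmark - realized


# Toy environment (hypothetical): action 0 is best early, action 1 is best late.
toy_payoffs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
always_zero = [0, 0, 0, 0]

static_regret = regret(toy_payoffs, always_zero, static_benchmark(toy_payoffs))
dynamic_regret = regret(toy_payoffs, always_zero, dynamic_benchmark(toy_payoffs, 1))
print(static_regret, dynamic_regret)
```

In this toy example, always playing action 0 matches the best static action yet falls short of the benchmark that is allowed one change, which is the gap the dynamic notion is meant to capture. With `max_changes = 0` the dynamic benchmark reduces to the static one.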

