We study Stackelberg games where a principal repeatedly interacts with a long-lived, non-myopic agent, without knowing the agent's payoff function. Although learning in Stackelberg games is well understood when the agent is myopic, non-myopic agents pose additional complications. In particular, non-myopic agents may strategically select actions that are inferior in the present to mislead the principal's learning algorithm and obtain better outcomes in the future. We provide a general framework that reduces learning in the presence of non-myopic agents to robust bandit optimization in the presence of myopic agents. Through the design and analysis of minimally reactive bandit algorithms, our reduction trades off the statistical efficiency of the principal's learning algorithm against its effectiveness in inducing near-best-responses. We apply this framework to Stackelberg security games (SSGs), pricing with an unknown demand curve, strategic classification, and general finite Stackelberg games. In each setting, we characterize the type and impact of misspecifications present in near-best-responses and develop a learning algorithm robust to such misspecifications. Along the way, we improve the query complexity of learning in SSGs with $n$ targets from the state-of-the-art $O(n^3)$ to a near-optimal $\widetilde{O}(n)$ by uncovering a fundamental structural property of such games. This result is of independent interest beyond learning with non-myopic agents.