Repeated Principal-Agent Games with Unobserved Agent Rewards and Perfect-Knowledge Agents

May 7, 2023

Ilgin Dogan, Zuo-Jun Max Shen, Anil Aswani

From arXiv; 50 pages, 4 figures

Motivated by real-world applications in domains such as healthcare and sustainable transportation, we study repeated principal-agent games within a multi-armed bandit (MAB) framework, in which the principal offers a different incentive for each bandit arm, the agent picks the arm that maximizes its own expected reward plus incentive, and the principal observes which arm is chosen and receives a reward (different from that of the agent) for the chosen arm. Designing policies for the principal is challenging because the principal cannot directly observe the reward that the agent receives for its chosen actions, and so cannot directly learn the agent's expected rewards using existing estimation techniques. As a result, the problem of designing policies for this scenario, and for similar ones, remains mostly unexplored. In this paper, we construct a policy that achieves low regret (i.e., square-root regret up to a log factor) for the case where the agent has perfect knowledge of its own expected rewards for each bandit arm. We design our policy by first constructing an estimator for the agent's expected reward for each bandit arm. Since our estimator uses as data the sequence of incentives offered and the arms subsequently chosen, the principal's estimation can be regarded as an analogue of online inverse optimization in MABs. Next, we construct a policy and prove that it achieves low regret by deriving finite-sample concentration bounds for our estimator. We conclude with numerical simulations demonstrating the applicability of our policy to a real-life setting from collaborative transportation planning.
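To make the interaction described in the abstract concrete, the following is a minimal sketch of the setting, not the paper's estimator or policy. It assumes a hypothetical agent with fixed, perfectly known expected rewards `theta` and a principal who offers uniformly random incentives (an assumption made purely for illustration). Each observed choice of arm `a` under incentives `pi` implies `theta[j] - theta[a] <= pi[a] - pi[j]` for every arm `j`, so the principal can narrow interval bounds on the agent's pairwise reward differences from incentives and choices alone, which is the inverse-optimization flavor the abstract refers to. All function and variable names are hypothetical.

```python
import numpy as np


def agent_choice(theta, incentives):
    """Perfect-knowledge agent: pick the arm maximizing expected reward plus incentive."""
    return int(np.argmax(theta + incentives))


def update_bounds(upper, incentives, chosen):
    """Tighten upper[j, chosen], an upper bound on theta[j] - theta[chosen].

    If `chosen` was picked, then for every arm j:
        theta[j] - theta[chosen] <= incentives[chosen] - incentives[j].
    """
    gap = incentives[chosen] - incentives  # gap[j] = pi[chosen] - pi[j]
    upper[:, chosen] = np.minimum(upper[:, chosen], gap)
    return upper


def simulate(theta, n_rounds, rng, scale=2.0):
    """Offer random incentives (illustrative assumption) and collect bounds."""
    k = len(theta)
    upper = np.full((k, k), np.inf)
    np.fill_diagonal(upper, 0.0)
    for _ in range(n_rounds):
        incentives = rng.uniform(0.0, scale, size=k)
        chosen = agent_choice(theta, incentives)  # the principal observes only this
        update_bounds(upper, incentives, chosen)
    # theta[j] - theta[a] >= -(upper bound on theta[a] - theta[j])
    lower = -upper.T
    return lower, upper


theta = np.array([0.2, 0.5, 0.9])  # hidden from the principal
rng = np.random.default_rng(0)
lower, upper = simulate(theta, n_rounds=2000, rng=rng)
true_diff = theta[:, None] - theta[None, :]  # true_diff[j, a] = theta[j] - theta[a]
print("largest interval width:", float(np.max(upper - lower)))
```

Note that only differences between arms are identifiable from choices in this sketch, and that random incentives ignore the principal's own rewards entirely; a regret-minimizing policy of the kind the paper constructs would have to choose incentives far more carefully.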
