Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning - 专知 (Zhuanzhi) Papers


May 5, 2022

Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning


Boxiang Lyu,Zhaoran Wang,Mladen Kolar,Zhuoran Yang

Dynamic mechanism design has garnered significant attention from both computer scientists and economists in recent years. By allowing agents to interact with the seller over multiple rounds, where agents' reward functions may change with time and are state-dependent, the framework is able to model a rich class of real-world problems. In these works, the interaction between the agents and the seller is often assumed to follow a Markov Decision Process (MDP). We focus on the setting where the reward and transition functions of such an MDP are not known a priori, and we attempt to recover the optimal mechanism using a data set collected a priori. When function approximation is employed to handle large state spaces, with only mild assumptions on the expressiveness of the function class, we are able to design a dynamic mechanism using offline reinforcement learning algorithms. Moreover, the learned mechanism approximately satisfies three key desiderata: efficiency, individual rationality, and truthfulness. Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set. To the best of our knowledge, our work provides the first offline RL algorithm for dynamic mechanism design without assuming uniform coverage.
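For readers unfamiliar with VCG, the three desiderata named in the abstract can be seen in the classical one-shot VCG mechanism. The sketch below is only an illustration under simplifying assumptions: a single round, a finite set of outcomes, and reported values that are known exactly. It is not the paper's offline, multi-round algorithm, and the function name `vcg` and the matrix `values` are hypothetical names chosen for this example.

```python
from typing import List, Optional, Tuple


def vcg(values: List[List[float]]) -> Tuple[int, List[float]]:
    """One-shot VCG with Clarke pivot payments (illustrative sketch).

    values[i][a] is agent i's reported value for outcome a.
    Returns the welfare-maximizing outcome and one payment per agent.
    """
    n_agents = len(values)
    n_outcomes = len(values[0])

    def welfare(outcome: int, excluded: Optional[int] = None) -> float:
        # Total reported value of an outcome, optionally leaving one agent out.
        return sum(values[i][outcome] for i in range(n_agents) if i != excluded)

    # Efficiency: pick the outcome that maximizes total reported value.
    chosen = max(range(n_outcomes), key=welfare)

    payments = []
    for i in range(n_agents):
        # Agent i pays the externality it imposes on the other agents:
        # the others' best achievable welfare without i, minus their
        # welfare under the outcome that was actually chosen.
        best_without_i = max(welfare(a, excluded=i) for a in range(n_outcomes))
        payments.append(best_without_i - welfare(chosen, excluded=i))
    return chosen, payments


# Single-item auction with three bidders: outcome a means "bidder a wins".
bids = [10.0, 7.0, 3.0]
values = [[bids[i] if a == i else 0.0 for a in range(3)] for i in range(3)]
chosen, payments = vcg(values)
print(chosen, payments)  # 0 [7.0, 0.0, 0.0]
```

In this single-item case the mechanism reduces to a second-price auction: the highest bidder wins and pays the second-highest bid. The winner's utility is 10 - 7 = 3, which is non-negative (individual rationality), and no bidder can gain by misreporting its value (truthfulness).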

