MDP 拥挤运动会在线限制满意度 (Online Constraint Satisfaction for MDP Congestion Games) - 专知论文

会员服务 ·

0

优化器 · 极小点 · 约束 · 可约的 · 近似误差 ·

2021 年 3 月 11 日

Online Constraint Satisfaction for MDP Congestion Games

翻译：MDP 拥挤运动会在线限制满意度

Sarah H. Q. Li,Yue Yu,Dan Calderone,Lillian J. Ratliff,Behcet Acikmese

from arxiv, 9 pages, 4 figures

Under the Markov decision process (MDP) congestion game framework, we study the problem of enforcing global constraints using tolls on a population of players with stochastic dynamics and coupled congestion costs. Existing work demonstrates that by enforcing identical tolls on every player, the optimal joint strategy for the playing population can be shifted to satisfy global design constraints. However, computing the minimum tolling value for constraint satisfaction requires explicit modelling of the congestion cost as a function of the playing population. In this paper, we assume that both the playing population and the constraint-enforcing authority, the game designer, lack such a model. Instead, the game designer can enforce tolls on a gaming instance that responds by approximating the optimal joint strategy under any toll. Under these assumptions, we develop a myopic algorithm that enables the game designer to compute the minimum tolling value, and prove that, up to the approximation error made by the gaming instance, our algorithm not only converges to the correct toll, but will guarantee average constraint satisfaction during the iterative process. Finally, we demonstrate how our model and algorithm can be applied to the profit-seeking ride-share driver population of Manhattan, New York City to optimally reduce traffic congestion using tolls.

翻译：在Markov决定(MDP)拥堵游戏框架下,我们研究使用对具有随机动态和同时拥堵成本的玩家人口进行收费来实施全球限制的问题。现有工作表明,通过对每个玩家实施相同的收费,可以改变游戏玩家的最佳联合战略,以满足全球设计限制。然而,计算制约满意度的最低定价值需要明确模拟作为玩家的功能的拥堵成本。在本文中,我们假设玩家人口和约束强制实施当局、游戏设计者都缺乏这样一个模型。相反,游戏设计者可以对一个通过在任何收费下接近最佳联合战略来应对的赌博实例强制实施收费。根据这些假设,我们开发了一种近似算法,使游戏设计者能够计算最小的定价值,并证明,除了游戏游戏游戏游戏游戏场的近似错误之外,我们的算法不仅与正确的收费一致,而且保证在迭接过程中平均的制约满意度。最后,我们展示了我们的模型和算法可以如何将最佳交通流量应用到曼哈顿州的交通。

0

相关内容

优化器

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

专知会员服务

52+阅读 · 2020年6月1日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【推荐】树莓派/OpenCV/dlib人脸定位/瞌睡检测

【推荐】树莓派/OpenCV/dlib人脸定位/瞌睡检测

机器学习研究会

9+阅读 · 2017年10月24日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach

Arxiv

4+阅读 · 2021年1月19日

Differential Dynamic Programming Neural Optimizer

Arxiv

7+阅读 · 2020年6月29日

Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization

Arxiv

4+阅读 · 2020年2月14日

Lipschitz Lifelong Reinforcement Learning

Arxiv

4+阅读 · 2020年1月17日

Information-Directed Exploration for Deep Reinforcement Learning

Information-Directed Exploration for Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年12月18日

Reward learning from human preferences and demonstrations in Atari

Arxiv

8+阅读 · 2018年11月15日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Learning to Update for Object Tracking

Arxiv

8+阅读 · 2018年6月19日

Learning Dynamic Memory Networks for Object Tracking

Arxiv

9+阅读 · 2018年3月20日

Safety-aware Adaptive Reinforcement Learning with Applications to Brushbot Navigation

Arxiv

4+阅读 · 2018年1月29日

VIP会员

文章信息

相关主题

相关VIP内容

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

【快讯】ICML 2020论文出炉，1088篇上榜，你的paper中了吗？

专知会员服务

52+阅读 · 2020年6月1日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】基础模型训练中网络规模数据的负责任与高效使用

《俄乌战争背景下俄罗斯的战略性海军分析（2022-2025年）》最新100页报告

人工智能时代背景下的未来海战

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【推荐】树莓派/OpenCV/dlib人脸定位/瞌睡检测

【推荐】树莓派/OpenCV/dlib人脸定位/瞌睡检测

机器学习研究会

9+阅读 · 2017年10月24日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach

Arxiv

4+阅读 · 2021年1月19日

Differential Dynamic Programming Neural Optimizer

Arxiv

7+阅读 · 2020年6月29日

Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization

Arxiv

4+阅读 · 2020年2月14日

Lipschitz Lifelong Reinforcement Learning

Arxiv

4+阅读 · 2020年1月17日

Information-Directed Exploration for Deep Reinforcement Learning

Information-Directed Exploration for Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年12月18日

Reward learning from human preferences and demonstrations in Atari

Arxiv

8+阅读 · 2018年11月15日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Learning to Update for Object Tracking

Arxiv

8+阅读 · 2018年6月19日

Learning Dynamic Memory Networks for Object Tracking

Arxiv

9+阅读 · 2018年3月20日

Safety-aware Adaptive Reinforcement Learning with Applications to Brushbot Navigation

Arxiv

4+阅读 · 2018年1月29日

微信扫码咨询专知VIP会员