For traffic routing platforms, the choice of which route to recommend to a user depends on the congestion on these routes -- indeed, an individual's utility depends on the number of people using the recommended route at that instant. Motivated by this, we introduce the problem of Congested Bandits, where each arm's reward is allowed to depend on the number of times it was played in the past $\Delta$ timesteps. This dependence on the past history of actions induces a dynamical system in which an algorithm's present choices also affect its future payoffs, and therefore requires the algorithm to plan ahead. We study this congestion-aware formulation in the multi-armed bandit (MAB) setup and in the contextual bandit setup with linear rewards. For the multi-armed setup, we propose a UCB-style algorithm and show that its policy regret scales as $\tilde{O}(\sqrt{K \Delta T})$. For the linear contextual bandit setup, our algorithm, based on an iterative least squares planner, achieves policy regret $\tilde{O}(\sqrt{dT} + \Delta)$. From an experimental standpoint, we corroborate the no-regret properties of our algorithms via a simulation study.
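To make the reward model concrete, the following is a minimal formalization under shorthand of our own choosing (the symbols $n_t(a)$, $f_a$, and $\varepsilon_t$ are illustrative and may differ from the notation used in the body of the paper):
$$
n_t(a) \;=\; \sum_{s=t-\Delta}^{t-1} \mathbb{1}\{a_s = a\},
\qquad
r_t \;=\; f_{a_t}\!\big(n_t(a_t)\big) + \varepsilon_t,
$$
where $n_t(a)$ counts how often arm $a$ was played in the preceding $\Delta$ rounds, $f_a$ maps this congestion level to the arm's mean reward, and $\varepsilon_t$ is zero-mean noise. Under this kind of model, policy regret compares the learner's cumulative reward against that of the best fixed policy evaluated on the congestion sequence that policy itself would induce, rather than on the learner's realized history.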