Efficient Algorithms for Finite Horizon and Streaming Restless Multi-Armed Bandit Problems

March 8, 2021

Aditya Mate, Arpita Biswas, Christoph Siebenbrunner, Milind Tambe

Restless Multi-Armed Bandits (RMABs) have been popularly used to model limited resource allocation problems. Recently, they have been employed for health monitoring and intervention planning problems. However, existing approaches fail to account for the arrival of new patients and the departure of enrolled patients from a treatment program. To address this challenge, we formulate a streaming bandit (S-RMAB) framework, a generalization of RMABs where heterogeneous arms arrive and leave under possibly random streams. We propose a new and scalable approach to computing index-based solutions. We start by proving that index values decrease for short residual lifetimes, a phenomenon that we call index decay. We then provide algorithms designed to capture index decay without having to solve the costly finite horizon problem, thereby lowering the computational complexity compared to existing methods. We evaluate our approach via simulations run on real-world data obtained from a tuberculosis intervention planning task as well as multiple other synthetic domains. Our algorithms achieve a speed-up of more than 150x over existing methods in these tasks without loss in performance. These findings are robust across multiple domains.
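To make the index decay mentioned in the abstract concrete, here is a minimal Python sketch. It is not the authors' algorithm: it computes a finite-horizon, Whittle-style index by brute force (backward induction plus bisection on a passive-action subsidy), which is exactly the costly computation the abstract says the proposed algorithms avoid. The two-state arm, its transition matrix `P`, the reward (equal to the state), and the function names `q_gap`, `finite_horizon_index`, and `select_arms` are all assumptions made for illustration, and the bisection assumes the arm is indexable with an index inside the search bounds.

```python
from typing import List, Sequence, Tuple

# Hypothetical two-state arm: state 0 = "bad", state 1 = "good"; reward equals the state.
# P[a][s][t] is the probability of moving from state s to state t under
# action a (0 = passive, 1 = active). These numbers are illustrative only.
P = [
    [[0.9, 0.1], [0.4, 0.6]],  # passive
    [[0.5, 0.5], [0.1, 0.9]],  # active
]


def q_gap(state: int, lifetime: int, subsidy: float) -> float:
    """Q(passive) - Q(active) in `state` with `lifetime` steps left,
    when the passive action earns an extra `subsidy` at every step."""
    value = [0.0, 0.0]  # value with zero steps left
    for _ in range(lifetime - 1):  # backward induction over the remaining steps
        value = [
            s + max(
                subsidy + sum(P[0][s][t] * value[t] for t in (0, 1)),
                sum(P[1][s][t] * value[t] for t in (0, 1)),
            )
            for s in (0, 1)
        ]
    # The immediate reward is the same for both actions, so it cancels in the gap.
    passive = subsidy + sum(P[0][state][t] * value[t] for t in (0, 1))
    active = sum(P[1][state][t] * value[t] for t in (0, 1))
    return passive - active


def finite_horizon_index(state: int, lifetime: int,
                         lo: float = -2.0, hi: float = 2.0,
                         tol: float = 1e-9) -> float:
    """Bisect for the subsidy at which passive and active are equally good."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q_gap(state, lifetime, mid) < 0:
            lo = mid  # active still preferred: a larger subsidy is needed
        else:
            hi = mid
    return (lo + hi) / 2


def select_arms(arms: Sequence[Tuple[int, int]], budget: int) -> List[int]:
    """Index policy: act on the `budget` arms with the largest index values.
    Each arm is given as (current state, residual lifetime)."""
    scores = [finite_horizon_index(s, h) for s, h in arms]
    ranked = sorted(range(len(arms)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    for h in (1, 2, 3, 5, 10):
        print(h, round(finite_horizon_index(0, h), 4))
    print(select_arms([(0, 1), (0, 2), (0, 3), (1, 3)], budget=2))
```

In this toy model an arm with a single step left has an index of zero, because acting on it can no longer change any future reward. That is the intuition behind index decay: in a streaming setting, arms about to leave the program should rank lower than otherwise identical arms with longer residual lifetimes.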
