Combinatorial Multi-armed Bandits for Resource Allocation - 专知论文


May 10, 2021


Jinhang Zuo, Carlee Joe-Wong

We study the sequential resource allocation problem, in which a decision maker repeatedly allocates budgets among resources. Motivating examples include allocating limited computing time or wireless spectrum bands to multiple users (i.e., resources). At each timestep, the decision maker should distribute its available budget among the resources to maximize the expected reward, or equivalently to minimize the cumulative regret. In doing so, the decision maker must learn the value of the resources allocated to each user from feedback on each user's received reward. For example, users may send messages of different urgency over wireless spectrum bands; the reward generated by allocating spectrum to a user then depends on the message's urgency. We assume that each user's reward follows a random process that is initially unknown. We design combinatorial multi-armed bandit algorithms to solve this problem with discrete or continuous budgets. We prove that the proposed algorithms achieve logarithmic regret under semi-bandit feedback.
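To make the discrete-budget setting concrete, the following is a minimal sketch and not the authors' algorithm. It assumes Bernoulli rewards in [0, 1], treats each (user, allocation) pair as a base arm, uses a standard UCB-style confidence bonus, and picks the allocation with a small dynamic program. The names `best_allocation`, `run_ucb_allocation`, and `true_means` are hypothetical and introduced only for illustration.

```python
import math
import random


def best_allocation(values, budget):
    """Maximize sum_k values[k][a_k] subject to sum_k a_k <= budget.

    values[k][a] is the (estimated) reward of giving a units to user k,
    for a = 0..budget. Solved by dynamic programming over users.
    """
    best = [0.0] * (budget + 1)   # best[b]: best value of users seen so far with b units
    choice = []                   # choice[k][b]: units given to user k in that optimum
    for user_values in values:
        new_best = [0.0] * (budget + 1)
        pick = [0] * (budget + 1)
        for b in range(budget + 1):
            candidates = [(best[b - a] + user_values[a], a) for a in range(b + 1)]
            new_best[b], pick[b] = max(candidates)
        best = new_best
        choice.append(pick)
    # Backtrack from the last user to recover the allocation.
    alloc = [0] * len(values)
    b = budget
    for k in reversed(range(len(values))):
        alloc[k] = choice[k][b]
        b -= alloc[k]
    return alloc


def run_ucb_allocation(true_means, budget, horizon, seed=0):
    """Simulate a UCB-style learner; true_means is unknown to the learner."""
    rng = random.Random(seed)
    num_users = len(true_means)
    counts = [[0] * (budget + 1) for _ in range(num_users)]
    means = [[0.0] * (budget + 1) for _ in range(num_users)]
    total_reward = 0.0
    alloc = [0] * num_users
    for t in range(1, horizon + 1):
        # Optimistic estimate for every (user, allocation) pair.
        ucb = [[1.0] * (budget + 1) for _ in range(num_users)]
        for k in range(num_users):
            for a in range(budget + 1):
                if counts[k][a] > 0:
                    bonus = math.sqrt(1.5 * math.log(t) / counts[k][a])
                    ucb[k][a] = min(1.0, means[k][a] + bonus)
        alloc = best_allocation(ucb, budget)
        # Semi-bandit feedback: one observed reward per user.
        for k, a in enumerate(alloc):
            reward = 1.0 if rng.random() < true_means[k][a] else 0.0
            counts[k][a] += 1
            means[k][a] += (reward - means[k][a]) / counts[k][a]
            total_reward += reward
    return alloc, total_reward


if __name__ == "__main__":
    # Two users, two budget units; the mean rewards are made-up illustration values.
    demo_means = [[0.0, 0.3, 0.4], [0.0, 0.6, 0.9]]
    print(run_ucb_allocation(demo_means, budget=2, horizon=2000))
```

The dynamic program stands in for whatever offline allocation oracle a given application provides. The continuous-budget case mentioned in the abstract would need some form of discretization, which this sketch does not attempt.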

