Recent work has considered natural variations of the multi-armed bandit problem, where the reward distribution of each arm is a special function of the time elapsed since it was last played. In this direction, a simple (yet widely applicable) model is that of blocking bandits, where an arm becomes unavailable for a deterministic number of rounds after each play. In this work, we extend the above model in two directions: (i) We consider the general combinatorial setting where more than one arm can be played at each round, subject to feasibility constraints. (ii) We allow the blocking time of each arm to be stochastic. We first study the computational/unconditional hardness of the above setting and identify the necessary conditions for the problem to become tractable (even in an approximate sense). Based on these conditions, we provide a tight analysis of the approximation guarantee of a natural greedy heuristic that always plays the feasible subset of available (non-blocked) arms with maximum expected reward. When the arms' expected rewards are unknown, we adapt the above heuristic into a UCB-based bandit algorithm, for which we provide sublinear (approximate) regret guarantees, matching the theoretical lower bounds in the limiting case where delays are absent.
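To make the greedy/UCB idea concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the paper: Bernoulli rewards, geometrically distributed blocking times, and a simple cardinality constraint (play at most k arms per round) standing in for the general feasibility constraints. All function and variable names here are illustrative only.

```python
import heapq
import math
import random

def greedy_ucb_blocking(n_arms, k, horizon, true_means, block_prob, seed=0):
    """Hypothetical sketch: each round, greedily play the top-k available arms
    ranked by their UCB index; played arms are then blocked for a random
    (here: geometric) number of rounds."""
    rng = random.Random(seed)
    pulls = [0] * n_arms           # number of times each arm was played
    reward_sum = [0.0] * n_arms    # cumulative observed reward per arm
    blocked_until = [0] * n_arms   # first round at which the arm is available again
    total_reward = 0.0

    for t in range(1, horizon + 1):
        available = [a for a in range(n_arms) if blocked_until[a] <= t]

        # UCB index: empirical mean plus an exploration bonus;
        # unplayed arms get an infinite index so they are tried at least once.
        def ucb(a):
            if pulls[a] == 0:
                return float("inf")
            mean = reward_sum[a] / pulls[a]
            return mean + math.sqrt(2.0 * math.log(t) / pulls[a])

        # Greedy step: play the feasible subset of available arms with the
        # largest total index (under a cardinality constraint, the top-k arms).
        chosen = heapq.nlargest(k, available, key=ucb)

        for a in chosen:
            r = 1.0 if rng.random() < true_means[a] else 0.0
            pulls[a] += 1
            reward_sum[a] += r
            total_reward += r
            # Stochastic blocking: the arm stays unavailable for a random
            # number of future rounds (geometric, as an assumption here).
            delay = 1
            while rng.random() < block_prob[a]:
                delay += 1
            blocked_until[a] = t + delay

    return total_reward

if __name__ == "__main__":
    means = [0.9, 0.8, 0.6, 0.4, 0.2]
    blocks = [0.7, 0.5, 0.3, 0.2, 0.1]
    print(greedy_ucb_blocking(n_arms=5, k=2, horizon=10_000,
                              true_means=means, block_prob=blocks))
```

For richer feasibility constraints (e.g., matroid or matching constraints), the top-k selection above would be replaced by the corresponding greedy maximization over available arms; the bookkeeping of blocking times is unchanged.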