We generalize the multiple-play multi-armed bandits (MP-MAB) problem to a shareable arm setting, in which several plays can share the same arm. Furthermore, each shareable arm has a finite reward capacity and a "per-load" reward distribution, both of which are unknown to the learner. The reward from a shareable arm is load-dependent: it equals the "per-load" reward multiplied by the number of plays pulling the arm, or by the arm's reward capacity when the number of plays exceeds that capacity. When the "per-load" reward follows a Gaussian distribution, we prove a sample complexity lower bound for learning the capacity from load-dependent rewards, as well as a regret lower bound for this new MP-MAB problem. We devise a capacity estimator whose sample complexity upper bound matches the lower bound in terms of reward means and capacities. We also propose an online learning algorithm to address the problem and prove its regret upper bound. The first term of this regret upper bound matches that of the regret lower bound, and its second and third terms also correspond to those of the lower bound. Extensive experiments validate our algorithm's performance and its gain in 5G & 4G base station selection.
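To make the load-dependent reward concrete, the following is a minimal sketch of the reward model described above, using assumed notation (the per-load reward draw $X_k$, number of plays $a_k$, and capacity $m_k$ are placeholder symbols, not names taken from the abstract):

$$
R_k \;=\; X_k \cdot \min\{a_k,\, m_k\},
$$

where $X_k$ is a sample from arm $k$'s unknown "per-load" reward distribution, $a_k$ is the number of plays assigned to arm $k$, and $m_k$ is the arm's unknown reward capacity; the reward thus scales linearly with the load until the capacity is reached, after which it saturates.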