Multi-player multi-armed bandits (MMAB) study how decentralized players cooperatively play the same multi-armed bandit so as to maximize their total cumulative reward. Existing MMAB models mostly assume that when more than one player pulls the same arm, the players either collide and obtain zero reward, or do not collide and gain independent rewards; both assumptions are usually too restrictive in practical scenarios. In this paper, we propose an MMAB with shareable resources as an extension of the collision and non-collision settings. Each shareable arm has a finite amount of shareable resources and a "per-load" reward random variable, both of which are unknown to the players. The reward from a shareable arm equals the "per-load" reward multiplied by the minimum of the number of players pulling the arm and the arm's maximal shareable resources. We consider two types of feedback: sharing demand information (SDI) and sharing demand awareness (SDA), each of which provides different signals of resource sharing. We design the DPE-SDI and SIC-SDA algorithms to address the shareable-arm problem under these two feedback settings respectively, and prove that both algorithms have logarithmic regrets that are tight in the number of rounds. We conduct simulations to validate both algorithms' performance and show their utility in wireless networking and edge computing.
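The reward rule described in the abstract can be sketched as a small simulation. This is a minimal illustrative sketch, not the paper's implementation: the Bernoulli per-load reward and all function and parameter names (`shareable_arm_reward`, `max_capacity`, `per_load_mean`) are assumptions introduced here for clarity.

```python
import random


def shareable_arm_reward(num_players: int, max_capacity: int,
                         per_load_mean: float) -> float:
    """Reward of one shareable arm in a single round.

    Following the model in the abstract, the reward equals the realized
    per-load reward times the effective load, where the effective load is
    min(number of pulling players, the arm's maximal shareable resources).
    A Bernoulli per-load reward is assumed here purely for illustration.
    """
    per_load_reward = 1.0 if random.random() < per_load_mean else 0.0
    effective_load = min(num_players, max_capacity)
    return per_load_reward * effective_load
```

For example, if five players pull an arm whose capacity is three, at most three units of resource are served, so the reward is at most three times the per-load reward; additional players beyond the capacity add nothing.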