配有异种制剂的分布式强盗 (Distributed Bandits with Heterogeneous Agents) - 专知论文

会员服务 ·

0

赌博机/老虎机 · ARM · 优化器 · INFORMS · Performer ·

2022 年 2 月 16 日

Distributed Bandits with Heterogeneous Agents

翻译：配有异种制剂的分布式强盗

Lin Yang,Yu-zhen Janice Chen,Mohammad Hajiesmaili,John CS Lui,Don Towsley

This paper tackles a multi-agent bandit setting where $M$ agents cooperate together to solve the same instance of a $K$-armed stochastic bandit problem. The agents are \textit{heterogeneous}: each agent has limited access to a local subset of arms and the agents are asynchronous with different gaps between decision-making rounds. The goal for each agent is to find its optimal local arm, and agents can cooperate by sharing their observations with others. While cooperation between agents improves the performance of learning, it comes with an additional complexity of communication between agents. For this heterogeneous multi-agent setting, we propose two learning algorithms, \ucbo and \AAE. We prove that both algorithms achieve order-optimal regret, which is $O\left(\sum_{i:\tilde{\Delta}_i>0} \log T/\tilde{\Delta}_i\right)$, where $\tilde{\Delta}_i$ is the minimum suboptimality gap between the reward mean of arm $i$ and any local optimal arm. In addition, a careful selection of the valuable information for cooperation, \AAE achieves a low communication complexity of $O(\log T)$. Last, numerical experiments verify the efficiency of both algorithms.

翻译：本文处理一个多试剂土匪设置, 由M$代理商合作解决相同的情况。代理商是 k$- 武装的土匪问题。代理商是\ textit{ 遗传性} : 每个代理商对本地武器子集的接触有限, 代理商对决策回合之间的不同差距是一样同步的。每个代理商的目标是找到其最佳的本地手臂, 代理商可以通过与他人分享其观测结果进行合作。虽然代理商之间的合作提高了学习绩效, 但代理商之间的沟通又增加了复杂性。对于这个多试剂混合的设置, 我们建议两种学习算法, 即 \ ucbo 和\ AAE。我们证明这两种算法都实现了定序- 最佳的遗憾, 也就是 $left( sum ⁇ i:\ tilde= Delta ⁇ >0} (log T/\ tilde_ delta\ i\ right) $, 代理商可以合作。其中, $\ delta $ i 。在代理商之间的沟通中, 一个最小的亚优度差距是 $ $ $ $ 和任何本地的精度 A 精度 A 。

0

相关内容

赌博机/老虎机

赌博机/老虎机

Into the Metaverse，93页ppt介绍元宇宙概念、应用、趋势

Into the Metaverse，93页ppt介绍元宇宙概念、应用、趋势

专知会员服务

49+阅读 · 2022年2月19日

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【ICML2021】异质风险最小化，Heterogeneous Risk Minimization

专知会员服务

16+阅读 · 2021年5月21日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

AAAI2021 | 图神经网络的异质图结构学习，Heterogeneous Graph Structure Learning for Graph Neural Networks

专知会员服务

92+阅读 · 2021年1月20日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

经典书《斯坦福大学-多智能体系统》532页pdf，MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations

经典书《斯坦福大学-多智能体系统》532页pdf，MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations

专知会员服务

158+阅读 · 2020年1月29日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

基于分形与数据流挖掘技术的动态数据挖掘方法及其应用研究

国家自然科学基金

1+阅读 · 2012年12月31日

Multi-Agent架构智能机器人推理机实时性研究

国家自然科学基金

1+阅读 · 2011年12月31日

基于多Agent技术的舰船综合电力系统重构研究

国家自然科学基金

4+阅读 · 2011年12月31日

含多微电网的配电网分层分布式协调控制研究

国家自然科学基金

1+阅读 · 2011年12月31日

基于multi-agent技术的地下铀矿山安全系统动态协调机制设计与群体智能优化控制研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于动态混合故障模型和进化博弈论的可生存性分析方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

多元逼近的贪婪算法与量子算法

国家自然科学基金

0+阅读 · 2009年12月31日

基于贝叶斯博弈的协同演化算法及其在交易Agent中的应用研究

国家自然科学基金

1+阅读 · 2009年12月31日

面向无纸贸易的在线支付金融和税收协同监管研究

国家自然科学基金

0+阅读 · 2009年12月31日

Unscented卡尔曼滤波算法及其在通信中的应用

国家自然科学基金

0+阅读 · 2008年12月31日

Almost Optimal Algorithms for Two-player Zero-Sum Linear Mixture Markov Games

Arxiv

0+阅读 · 2022年4月20日

Can DtN and GenEO coarse spaces be sufficiently robust for heterogeneous Helmholtz problems?

Arxiv

0+阅读 · 2022年4月19日

Perceptive Mobile Network with Distributed Target Monitoring Terminals: Leaking Communication Energy for Sensing

Arxiv

0+阅读 · 2022年4月19日

FedKL: Tackling Data Heterogeneity in Federated Reinforcement Learning by Penalizing KL Divergence

Arxiv

0+阅读 · 2022年4月18日

A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits

Arxiv

0+阅读 · 2022年4月17日

Nested smoothing algorithms for inference and tracking of heterogeneous multi-scale state-space systems

Arxiv

0+阅读 · 2022年4月16日

A Distributed and Elastic Aggregation Service for Scalable Federated Learning Systems

Arxiv

0+阅读 · 2022年4月16日

A Reinforcement Learning Approach to Parameter Selection for Distributed Optimal Power Flow

Arxiv

0+阅读 · 2022年4月15日

Data-Free Knowledge Distillation for Heterogeneous Federated Learning

Arxiv

12+阅读 · 2021年6月9日

Distributed Graph Convolutional Networks

Arxiv

19+阅读 · 2020年7月13日

VIP会员

文章信息

相关主题

赌博机/老虎机

相关VIP内容

Into the Metaverse，93页ppt介绍元宇宙概念、应用、趋势

Into the Metaverse，93页ppt介绍元宇宙概念、应用、趋势

专知会员服务

49+阅读 · 2022年2月19日

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【ICML2021】异质风险最小化，Heterogeneous Risk Minimization

专知会员服务

16+阅读 · 2021年5月21日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

AAAI2021 | 图神经网络的异质图结构学习，Heterogeneous Graph Structure Learning for Graph Neural Networks

专知会员服务

92+阅读 · 2021年1月20日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

经典书《斯坦福大学-多智能体系统》532页pdf，MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations

经典书《斯坦福大学-多智能体系统》532页pdf，MULTIAGENT SYSTEMS Algorithmic, Game-Theoretic, and Logical Foundations

专知会员服务

158+阅读 · 2020年1月29日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

大模型解决方案白皮书：社交陪伴场景全流程落地指南

面向具身操作的视觉-语言-动作模型综述

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

Almost Optimal Algorithms for Two-player Zero-Sum Linear Mixture Markov Games

Arxiv

0+阅读 · 2022年4月20日

Can DtN and GenEO coarse spaces be sufficiently robust for heterogeneous Helmholtz problems?

Arxiv

0+阅读 · 2022年4月19日

Perceptive Mobile Network with Distributed Target Monitoring Terminals: Leaking Communication Energy for Sensing

Arxiv

0+阅读 · 2022年4月19日

FedKL: Tackling Data Heterogeneity in Federated Reinforcement Learning by Penalizing KL Divergence

Arxiv

0+阅读 · 2022年4月18日

A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits

Arxiv

0+阅读 · 2022年4月17日

Nested smoothing algorithms for inference and tracking of heterogeneous multi-scale state-space systems

Arxiv

0+阅读 · 2022年4月16日

A Distributed and Elastic Aggregation Service for Scalable Federated Learning Systems

Arxiv

0+阅读 · 2022年4月16日

A Reinforcement Learning Approach to Parameter Selection for Distributed Optimal Power Flow

Arxiv

0+阅读 · 2022年4月15日

Data-Free Knowledge Distillation for Heterogeneous Federated Learning

Arxiv

12+阅读 · 2021年6月9日

Distributed Graph Convolutional Networks

Arxiv

19+阅读 · 2020年7月13日

相关基金

基于分形与数据流挖掘技术的动态数据挖掘方法及其应用研究

国家自然科学基金

1+阅读 · 2012年12月31日

Multi-Agent架构智能机器人推理机实时性研究

国家自然科学基金

1+阅读 · 2011年12月31日

基于多Agent技术的舰船综合电力系统重构研究

国家自然科学基金

4+阅读 · 2011年12月31日

含多微电网的配电网分层分布式协调控制研究

国家自然科学基金

1+阅读 · 2011年12月31日

基于multi-agent技术的地下铀矿山安全系统动态协调机制设计与群体智能优化控制研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于动态混合故障模型和进化博弈论的可生存性分析方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

多元逼近的贪婪算法与量子算法

国家自然科学基金

0+阅读 · 2009年12月31日

基于贝叶斯博弈的协同演化算法及其在交易Agent中的应用研究

国家自然科学基金

1+阅读 · 2009年12月31日

面向无纸贸易的在线支付金融和税收协同监管研究

国家自然科学基金

0+阅读 · 2009年12月31日

Unscented卡尔曼滤波算法及其在通信中的应用

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员