This paper studies a cooperative multi-agent stochastic multi-armed bandit problem in which agents operate asynchronously -- their pull times and rates are unknown, irregular, and heterogeneous -- while facing the same instance of a K-armed bandit. Agents can share reward information to speed up learning, at the cost of additional communication. We propose ODC, an on-demand communication protocol that tailors the communication between each pair of agents to their empirical pull times. ODC is efficient when agents' pull times are highly heterogeneous, and its communication complexity depends on those empirical pull times. ODC is a generic protocol that can be integrated into most cooperative bandit algorithms without degrading their performance. We then incorporate ODC into natural extensions of the UCB and AAE algorithms, yielding two communication-efficient cooperative algorithms. Our analysis shows that both algorithms are near-optimal in regret.
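The on-demand idea in the abstract -- each agent buffers its observations per peer and communicates only when warranted by empirical pull counts -- can be illustrated with a minimal sketch. All names, the buffering scheme, and the threshold rule below are our own illustrative choices, not the paper's exact ODC protocol:

```python
class ODCAgent:
    """Illustrative agent with per-peer on-demand communication.

    Rewards observed since the last sync with a peer are buffered and
    sent only when the buffer is large relative to that peer's
    empirical pull count, so pairs with very different pull rates
    exchange few messages (hypothetical rule, not the paper's ODC).
    """

    def __init__(self, agent_id, n_arms, peer_ids):
        self.agent_id = agent_id
        self.buffers = {p: [] for p in peer_ids}  # peer -> [(arm, reward)]
        self.pull_counts = [0] * n_arms
        self.reward_sums = [0.0] * n_arms

    def pull(self, arm, reward):
        # Record a local observation and buffer it for every peer.
        self.pull_counts[arm] += 1
        self.reward_sums[arm] += reward
        for buf in self.buffers.values():
            buf.append((arm, reward))

    def should_send(self, peer_id, peer_pull_count):
        # Communicate on demand: only when the buffered observations
        # are a constant fraction of the peer's empirical pull count,
        # so slowly-pulling peers trigger few messages.
        return len(self.buffers[peer_id]) >= max(1, peer_pull_count // 2)

    def send(self, peer):
        # Flush the buffer for this peer; the peer folds the shared
        # observations into its own arm statistics.
        msgs = self.buffers[peer.agent_id]
        for arm, reward in msgs:
            peer.pull_counts[arm] += 1
            peer.reward_sums[arm] += reward
        self.buffers[peer.agent_id] = []
        return len(msgs)
```

A fast agent paired with a slow one would accumulate a large buffer before the threshold fires, which is the sense in which communication adapts to heterogeneous pull rates.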