Constant or logarithmic regret in asynchronous multiplayer bandits


May 31, 2023


Hugo Richard, Etienne Boursier, Vianney Perchet

Multiplayer bandits have recently been studied extensively because of their application to cognitive radio networks. While the literature mostly considers synchronous players, radio networks (e.g., for IoT) tend to have asynchronous devices. This motivates the harder asynchronous multiplayer bandits problem, which was first tackled with an explore-then-commit (ETC) algorithm (see Dakdouk, 2022), with a regret upper bound in $\mathcal{O}(T^{\frac{2}{3}})$. Before even considering decentralization, understanding the centralized case was still a challenge, as it was unknown whether a regret smaller than $\Omega(T^{\frac{2}{3}})$ was achievable. We answer this question positively: a natural extension of UCB exhibits an $\mathcal{O}(\sqrt{T\log(T)})$ minimax regret. More importantly, we introduce Cautious Greedy, a centralized algorithm that yields constant instance-dependent regret if the optimal policy assigns at least one player to each arm (a situation that is proved to occur when arm means are close enough). Otherwise, its regret increases as the sum of $\log(T)$ over some sub-optimality gaps. We provide lower bounds showing that Cautious Greedy is optimal in the data-dependent terms. We thereby set up a strong baseline for asynchronous multiplayer bandits and suggest that learning the optimal policy in this problem might be easier than thought, at least with centralization.
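To give a feel for the gap between the two minimax rates quoted above, the following is a minimal numerical sketch that compares $T^{2/3}$ with $\sqrt{T\log T}$ at a few horizons. It ignores all constants and problem-dependent factors hidden by the $\mathcal{O}$ notation, so it illustrates growth rates only and does not reproduce any result from the paper. The function names `etc_rate`, `ucb_rate` and `rate_ratio` are hypothetical, chosen for this illustration.

```python
import math


def etc_rate(T: int) -> float:
    """Growth rate T^(2/3) of the ETC upper bound (constants ignored)."""
    return T ** (2.0 / 3.0)


def ucb_rate(T: int) -> float:
    """Growth rate sqrt(T log T) of the UCB-extension bound (constants ignored)."""
    return math.sqrt(T * math.log(T))


def rate_ratio(T: int) -> float:
    """Ratio sqrt(T log T) / T^(2/3); equals sqrt(log T) / T^(1/6)."""
    return ucb_rate(T) / etc_rate(T)


if __name__ == "__main__":
    # Illustrative horizons only; these values are assumptions, not from the paper.
    for T in (10**3, 10**4, 10**6, 10**8):
        print(
            f"T={T:>10}  T^(2/3)={etc_rate(T):12.1f}  "
            f"sqrt(T log T)={ucb_rate(T):12.1f}  ratio={rate_ratio(T):.3f}"
        )
```

The ratio shrinks as the horizon grows, which is the sense in which the $\sqrt{T\log(T)}$ rate improves on $T^{\frac{2}{3}}$ asymptotically. At very small horizons the ordering can flip, so the comparison says nothing about short runs or about the constants involved.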
