Communication-Constrained Bandits under Additive Gaussian Noise - 专知 (Zhuanzhi) Papers


Multi-armed bandits · UniFormer · Estimation/estimators · Noise · Learner

April 25, 2023

Communication-Constrained Bandits under Additive Gaussian Noise


Prathamesh Mayekar,Jonathan Scarlett,Vincent Y. F. Tan

We study a distributed stochastic multi-armed bandit problem in which a client supplies the learner with communication-constrained feedback based on the rewards for the corresponding arm pulls. In our setup, the client must encode the rewards such that the second moment of the encoded rewards is no more than $P$, and this encoded reward is further corrupted by additive Gaussian noise of variance $\sigma^2$; the learner only has access to this corrupted reward. For this setting, we derive an information-theoretic lower bound of $\Omega\left(\sqrt{\frac{KT}{\mathtt{SNR} \wedge 1}}\right)$ on the minimax regret of any scheme, where $\mathtt{SNR} := \frac{P}{\sigma^2}$, and $K$ and $T$ are the number of arms and the time horizon, respectively. Furthermore, we propose a multi-phase bandit algorithm, $\mathtt{UE\text{-}UCB++}$, which matches this lower bound up to a minor additive factor. $\mathtt{UE\text{-}UCB++}$ performs uniform exploration in its initial phases and then uses the upper confidence bound (UCB) bandit algorithm in its final phase. An interesting feature of $\mathtt{UE\text{-}UCB++}$ is that the coarse estimates of the mean rewards formed during one uniform exploration phase help to refine the encoding protocol in the next phase, leading to more accurate estimates of the mean rewards in that subsequent phase. This positive reinforcement cycle is critical to reducing the number of uniform exploration rounds and closely matching our lower bound.
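The channel model and the refinement idea in the abstract can be illustrated with a small Python sketch. It is not the paper's $\mathtt{UE\text{-}UCB++}$ encoder: the centre-and-scale encoding, the clipping step, and all function and parameter names (`encode`, `decode`, `center`, `width`, and so on) are assumptions made here for illustration. The sketch evaluates the order of the lower bound, $\sqrt{KT/(\mathtt{SNR} \wedge 1)}$, with constants omitted, and shows why a tighter encoding window shrinks the noise seen by the learner after decoding.

```python
import math
import random


def regret_lower_bound_rate(K, T, snr):
    """Order of the lower bound, sqrt(K*T / min(SNR, 1)); constants omitted."""
    return math.sqrt(K * T / min(snr, 1.0))


def encode(reward, center, width, power):
    """Hypothetical encoder: centre the reward, clip it to [-width, width],
    and scale so the transmitted value x always satisfies x**2 <= power."""
    offset = max(-width, min(width, reward - center))
    return math.sqrt(power) * offset / width


def channel(x, sigma, rng):
    """Additive Gaussian noise channel with noise variance sigma**2."""
    return x + rng.gauss(0.0, sigma)


def decode(y, center, width, power):
    """Invert the scaling applied by encode (exact when there is no noise
    and no clipping)."""
    return center + width * y / math.sqrt(power)


def decoded_noise_std(width, power, sigma):
    """Standard deviation of the channel noise after decoding:
    width * sigma / sqrt(power) = width / sqrt(SNR)."""
    return width * sigma / math.sqrt(power)


if __name__ == "__main__":
    rng = random.Random(0)
    power, sigma = 1.0, 2.0  # illustrative values, SNR = 0.25
    # Coarse phase: nothing is known, so use a wide window around 0.5.
    coarse = decode(channel(encode(0.7, 0.5, 0.5, power), sigma, rng), 0.5, 0.5, power)
    # Refined phase: a rough mean estimate allows a narrower window.
    fine = decode(channel(encode(0.7, 0.68, 0.05, power), sigma, rng), 0.68, 0.05, power)
    print("coarse decode:", coarse, "noise std:", decoded_noise_std(0.5, power, sigma))
    print("fine decode:  ", fine, "noise std:", decoded_noise_std(0.05, power, sigma))
```

Under these assumptions the decoded noise scales linearly with the window width, so a coarse mean estimate from one phase directly buys a less noisy observation in the next. Clipping keeps the power constraint satisfied but biases the decoded value whenever the reward falls outside the window, which is one reason a real scheme has to choose the window carefully.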

