We study the problem of distributed stochastic multi-armed contextual bandits with unknown contexts, in which M agents work collaboratively under the coordination of a central server to choose optimal actions and minimize the total regret. In our model, an adversary chooses a distribution over the set of possible contexts, and the agents observe only this context distribution; the exact context remains unknown to them. Such a situation arises, for instance, when the context itself is a noisy measurement or comes from a prediction mechanism, as in weather forecasting or stock market prediction. Our goal is to develop a distributed algorithm that selects a sequence of optimal actions to maximize the cumulative reward. By performing a feature vector transformation and leveraging the UCB framework, we propose a UCB-based algorithm for stochastic bandits with context distributions and prove that, for linearly parametrized reward functions, it achieves regret and communication bounds of $O(d\sqrt{MT}\log^2 T)$ and $O(M^{1.5} d^3)$, respectively. We also consider the case where the agents observe the actual context after choosing an action; for this setting we present a modified algorithm that exploits the additional information to achieve a tighter regret bound. Finally, we validate the performance of our algorithms and compare them with other baseline approaches through extensive simulations on synthetic data and on the real-world MovieLens dataset.
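To make the feature vector transformation concrete, below is a minimal single-agent sketch of the underlying idea: since the exact context $c_t$ is unobserved, the learner replaces the feature vector $\phi(c_t, a)$ with its expectation $\psi_t(a) = \mathbb{E}_{c \sim \mu_t}[\phi(c, a)]$ under the announced context distribution $\mu_t$, and runs standard LinUCB on these expected features. The discrete-context setup and all names (phi, beta, lam) are illustrative assumptions, not the paper's actual implementation, which additionally handles multi-agent communication with the server.

```python
# Minimal sketch: LinUCB on expected feature vectors under a context
# distribution. Assumed setup, not the paper's code.
import numpy as np

rng = np.random.default_rng(0)
d, K, n_contexts, T = 5, 4, 3, 2000   # feature dim, arms, contexts, horizon
lam, beta = 1.0, 1.0                  # ridge parameter and exploration width

phi = rng.normal(size=(n_contexts, K, d))   # feature map phi(c, a)
theta_star = rng.normal(size=d)             # unknown reward parameter
theta_star /= np.linalg.norm(theta_star)

A = lam * np.eye(d)   # ridge-regression Gram matrix
b = np.zeros(d)

for t in range(T):
    mu = rng.dirichlet(np.ones(n_contexts))  # adversary's context distribution
    c = rng.choice(n_contexts, p=mu)         # true context, hidden from learner

    # Expected features: psi[a] = E_{c ~ mu}[phi(c, a)]
    psi = (mu @ phi.reshape(n_contexts, -1)).reshape(K, d)

    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b
    ucb = psi @ theta_hat + beta * np.sqrt(
        np.einsum('ad,dk,ak->a', psi, A_inv, psi))
    a = int(np.argmax(ucb))

    # Reward depends on the *true* context; the learner only saw mu.
    r = phi[c, a] @ theta_star + 0.1 * rng.normal()
    A += np.outer(psi[a], psi[a])
    b += r * psi[a]
```

In the variant where the actual context is revealed after the action is chosen, the update step would use the realized features phi[c, a] instead of psi[a], which is the extra information the modified algorithm exploits for its tighter regret bound.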