Nonstationary Bandit Learning via Predictive Sampling - Zhuanzhi Papers


May 4, 2022

Nonstationary Bandit Learning via Predictive Sampling


Yueyang Liu,Benjamin Van Roy,Kuang Xu

We propose predictive sampling as an approach to selecting actions that balance between exploration and exploitation in nonstationary bandit environments. When specialized to stationary environments, predictive sampling is equivalent to Thompson sampling. However, predictive sampling is effective across a range of nonstationary environments in which Thompson sampling suffers. We establish a general information-theoretic bound on the Bayesian regret of predictive sampling. We then specialize this bound to study a modulated Bernoulli bandit environment. Our analysis highlights a key advantage of predictive sampling over Thompson sampling: predictive sampling deprioritizes investments in exploration where acquired information will quickly become less relevant.
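As a point of reference for the stationary special case mentioned in the abstract, where predictive sampling is stated to coincide with Thompson sampling, the following is a minimal sketch of Thompson sampling on a stationary Bernoulli bandit. It is not the authors' predictive sampling algorithm. The function name `thompson_sampling`, the uniform Beta(1, 1) prior, and the arm means in the usage note are illustrative assumptions and do not come from the paper.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Run Thompson sampling on a stationary Bernoulli bandit (illustrative sketch).

    true_means: success probability of each arm (assumed fixed over time).
    horizon: number of rounds to play.
    Returns (pulls per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    # Assumed prior: Beta(1, 1), i.e. uniform, for every arm.
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(horizon):
        # Draw one sample of each arm's mean from its current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled means.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Observe a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugate Beta-Bernoulli posterior update.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward
```

Calling `thompson_sampling([0.1, 0.9], horizon=2000)` returns the number of pulls per arm and the cumulative reward. In this stationary setting every observation stays relevant forever, which is the assumption that breaks down in the nonstationary environments the abstract targets: there, information about an arm can lose value quickly, and the abstract argues that exploration should be deprioritized accordingly.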

