《在强盗中学习先行者》不容遗憾 (No Regrets for Learning the Prior in Bandits) - 专知论文

会员服务 ·

0

赌博机/老虎机 · INTERACT · 学成 · 边缘化 · 粤港澳大湾区数字经济研究院 ·

2021 年 7 月 13 日

No Regrets for Learning the Prior in Bandits

翻译：《在强盗中学习先行者》不容遗憾

Soumya Basu,Branislav Kveton,Manzil Zaheer,Csaba Szepesvári

We propose ${\tt AdaTS}$, a Thompson sampling algorithm that adapts sequentially to bandit tasks that it interacts with. The key idea in ${\tt AdaTS}$ is to adapt to an unknown task prior distribution by maintaining a distribution over its parameters. When solving a bandit task, that uncertainty is marginalized out and properly accounted for. ${\tt AdaTS}$ is a fully-Bayesian algorithm that can be implemented efficiently in several classes of bandit problems. We derive upper bounds on its Bayes regret that quantify the loss due to not knowing the task prior, and show that it is small. Our theory is supported by experiments, where ${\tt AdaTS}$ outperforms prior algorithms and works well even in challenging real-world problems.

翻译：我们提出美元,这是汤普森抽样算法,可以按顺序适应它与之互动的土匪任务。关键的想法是,通过维持对参数的分布来适应一个未知的任务先前的分配。当解决土匪任务时,这种不确定性被排挤出来,并适当说明原因。美元是完全Bayesian算法,可以在几类土匪问题中有效实施。我们从贝斯获得上界的遗憾,因为它不知道之前的任务,而量化了损失,并表明它是很小的。我们的理论得到了实验的支持,在实验中,美元优于先前的算法,甚至在挑战现实世界的问题中也运作良好。

0

相关内容

赌博机/老虎机

赌博机/老虎机

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

【ICML2020】拉普拉斯正则化小样本学习，Laplacian Regularized Few-Shot Learning

【ICML2020】拉普拉斯正则化小样本学习，Laplacian Regularized Few-Shot Learning

专知会员服务

77+阅读 · 2020年6月28日

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

专知会员服务

13+阅读 · 2020年6月8日

因果关联学习，Causal Relational Learning

因果关联学习，Causal Relational Learning

专知会员服务

185+阅读 · 2020年4月21日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

TensorFlow深度学习，从线性回归到强化学习的深度学习（TensorFlow for Deep Learning From Linear Regression to Reinforcement Learning），附页256页pdf

TensorFlow深度学习，从线性回归到强化学习的深度学习（TensorFlow for Deep Learning From Linear Regression to Reinforcement Learning），附页256页pdf

专知会员服务

46+阅读 · 2020年1月1日

【健康医疗中的机器学习算法综述】A Survey Of Machine Learning Algorithms In Health Care

【健康医疗中的机器学习算法综述】A Survey Of Machine Learning Algorithms In Health Care

专知会员服务

14+阅读 · 2019年11月19日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

已删除

将门创投

6+阅读 · 2019年9月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Machine Learning for Online Algorithm Selection under Censored Feedback

Arxiv

0+阅读 · 2021年9月13日

Lenient Regret for Multi-Armed Bandits

Arxiv

0+阅读 · 2021年9月12日

Differentially Private Federated Learning with Laplacian Smoothing

Arxiv

0+阅读 · 2021年9月10日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

LDP-FL: Practical Private Aggregation in Federated Learning with Local Differential Privacy

Arxiv

5+阅读 · 2020年7月31日

No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium

Arxiv

4+阅读 · 2020年6月20日

Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning

Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning

Arxiv

3+阅读 · 2019年6月20日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Modeling Others using Oneself in Multi-Agent Reinforcement Learning

Arxiv

4+阅读 · 2018年3月22日

Together or Alone: The Price of Privacy in Collaborative Learning

Arxiv

4+阅读 · 2018年2月28日

VIP会员

文章信息

相关主题

赌博机/老虎机

粤港澳大湾区数字经济研究院

相关VIP内容

哥伦比亚大学最新《机器学习》课程，Fall-B 2020 (Machine Learning)

专知会员服务

39+阅读 · 2020年11月3日

【ICML2020】拉普拉斯正则化小样本学习，Laplacian Regularized Few-Shot Learning

【ICML2020】拉普拉斯正则化小样本学习，Laplacian Regularized Few-Shot Learning

专知会员服务

77+阅读 · 2020年6月28日

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

专知会员服务

13+阅读 · 2020年6月8日

因果关联学习，Causal Relational Learning

因果关联学习，Causal Relational Learning

专知会员服务

185+阅读 · 2020年4月21日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

TensorFlow深度学习，从线性回归到强化学习的深度学习（TensorFlow for Deep Learning From Linear Regression to Reinforcement Learning），附页256页pdf

TensorFlow深度学习，从线性回归到强化学习的深度学习（TensorFlow for Deep Learning From Linear Regression to Reinforcement Learning），附页256页pdf

专知会员服务

46+阅读 · 2020年1月1日

【健康医疗中的机器学习算法综述】A Survey Of Machine Learning Algorithms In Health Care

【健康医疗中的机器学习算法综述】A Survey Of Machine Learning Algorithms In Health Care

专知会员服务

14+阅读 · 2019年11月19日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

大语言模型基准综述

《自适应训练辅助系统概念导论及其在空战指挥官加速培训中的应用》125页

【剑桥博士论文】多智能体学习中的神经多样性

以色列-伊朗空战：短暂而激烈冲突的启示

相关资讯

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

已删除

将门创投

6+阅读 · 2019年9月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Machine Learning for Online Algorithm Selection under Censored Feedback

Arxiv

0+阅读 · 2021年9月13日

Lenient Regret for Multi-Armed Bandits

Arxiv

0+阅读 · 2021年9月12日

Differentially Private Federated Learning with Laplacian Smoothing

Arxiv

0+阅读 · 2021年9月10日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

LDP-FL: Practical Private Aggregation in Federated Learning with Local Differential Privacy

Arxiv

5+阅读 · 2020年7月31日

No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium

Arxiv

4+阅读 · 2020年6月20日

Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning

Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning

Arxiv

3+阅读 · 2019年6月20日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

Modeling Others using Oneself in Multi-Agent Reinforcement Learning

Arxiv

4+阅读 · 2018年3月22日

Together or Alone: The Price of Privacy in Collaborative Learning

Arxiv

4+阅读 · 2018年2月28日

微信扫码咨询专知VIP会员