A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits - Zhuanzhi Papers

Bandits · Unit variance · Sample complexity · Samples · Regularization term

May 27, 2021

A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits

Antoine Barrier, Aurélien Garivier, Tomáš Kocák

We propose a new strategy for fixed-confidence best-arm identification among Gaussian arms with bounded means and unit variance. This strategy, called Exploration-Biased Sampling, is not only asymptotically optimal: we also prove non-asymptotic bounds that hold with high probability. To the best of our knowledge, this is the first strategy with such guarantees. Its main advantage over other algorithms such as Track-and-Stop, however, is improved exploration behavior: Exploration-Biased Sampling is slightly biased in favor of exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are made possible by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe are of high independent interest.
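
For readers unfamiliar with the "sample complexity optimization problem" mentioned above, the following is a minimal sketch of its standard form in the fixed-confidence literature for unit-variance Gaussian arms. It is an assumption of this note, not something stated in the abstract: the characteristic time T*(mu) satisfies 1/T*(mu) = max over weights w on the simplex of min over suboptimal arms a of w_best * w_a / (w_best + w_a) * (mu_best - mu_a)^2 / 2. The function names `equalized_value` and `characteristic_time` are hypothetical, and the solver (bisection nested inside a golden-section search) is a generic illustration. It is neither the paper's faster numerical resolution scheme nor the Exploration-Biased Sampling strategy itself.

```python
import math


def equalized_value(w_best, half_sq_gaps):
    """Best achievable min-cost y for a fixed weight w_best on the best arm.

    Each suboptimal arm a has cost w_best*w_a/(w_best+w_a) * c_a, which is
    increasing in w_a, so the optimum spends the remaining mass 1 - w_best
    so that all costs are equal to a common value y. We find y by bisection.
    """
    budget = 1.0 - w_best
    lo, hi = 0.0, w_best * min(half_sq_gaps)
    for _ in range(100):
        y = 0.5 * (lo + hi)
        needed = 0.0
        for c in half_sq_gaps:
            denom = w_best * c - y
            # Weight required on this arm so that its cost equals y.
            needed += y * w_best / denom if denom > 0.0 else math.inf
        if needed > budget:
            hi = y
        else:
            lo = y
    return lo


def characteristic_time(mu):
    """Return (T*, optimal weight of the best arm) for unit-variance Gaussians."""
    best = max(range(len(mu)), key=lambda a: mu[a])
    c = [(mu[best] - m) ** 2 / 2.0 for a, m in enumerate(mu) if a != best]
    if min(c) == 0.0:
        raise ValueError("the best arm must be unique")
    # The objective is concave in w_best, so golden-section search applies.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = 0.0, 1.0
    for _ in range(100):
        x1 = b - inv_phi * (b - a)
        x2 = a + inv_phi * (b - a)
        if equalized_value(x1, c) < equalized_value(x2, c):
            a = x1
        else:
            b = x2
    w_best = 0.5 * (a + b)
    return 1.0 / equalized_value(w_best, c), w_best


if __name__ == "__main__":
    # Two arms with gap 1: the closed form is T* = 8 / gap^2.
    print(characteristic_time([1.0, 0.0]))
```

With two arms and gap Δ, the maximum is attained at equal weights and gives T* = 8/Δ², which is a convenient sanity check for any solver of this problem.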
