We lay the foundations of a non-parametric theory of best-arm identification in multi-armed bandits with a fixed budget T. We consider general, possibly non-parametric, models D for distributions over the arms; an overarching example is the model D = P(0,1) of all probability distributions over [0,1]. We propose upper bounds on the average log-probability of misidentifying the optimal arm, stated in terms of information-theoretic quantities given by infima of Kullback-Leibler divergences between some distributions in D and a given distribution. This is made possible by a refined analysis of the successive-rejects strategy of Audibert, Bubeck, and Munos (2010). Finally, we provide lower bounds on the same average log-probability, again in terms of these new information-theoretic quantities; the lower bounds are larger when the (natural) assumptions on the considered strategies are stronger. All these new upper and lower bounds generalize existing bounds based, for example, on gaps between distributions.
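For readers unfamiliar with the successive-rejects strategy referenced above, the following is a minimal sketch of the classical algorithm of Audibert, Bubeck, and Munos (2010), not of the refined analysis developed in this paper. The function name `successive_rejects` and the reward oracle `pull` are illustrative assumptions; the phase lengths follow the standard definition with the normalization constant log-bar(K) = 1/2 + sum_{i=2}^{K} 1/i.

```python
import math

def successive_rejects(pull, K, T):
    """Sketch of the successive-rejects strategy (Audibert, Bubeck & Munos, 2010).

    pull(i) is an oracle returning one reward in [0, 1] for arm i;
    K is the number of arms, T the fixed budget of arm pulls.
    Runs K-1 phases; at the end of each phase, the arm with the
    lowest empirical mean is discarded. Returns the surviving arm.
    """
    # Normalization constant log-bar(K) = 1/2 + sum_{i=2}^{K} 1/i.
    log_bar = 0.5 + sum(1.0 / i for i in range(2, K + 1))
    active = list(range(K))
    counts = [0] * K       # number of pulls of each arm so far
    sums = [0.0] * K       # cumulative reward of each arm so far
    n_prev = 0             # per-arm pull count reached in the previous phase
    for phase in range(1, K):  # phases k = 1, ..., K-1
        # Target per-arm pull count n_k for this phase.
        n_k = math.ceil((T - K) / (log_bar * (K + 1 - phase)))
        for i in active:
            for _ in range(n_k - n_prev):
                sums[i] += pull(i)
                counts[i] += 1
        n_prev = n_k
        # Reject the active arm with the lowest empirical mean.
        worst = min(active, key=lambda i: sums[i] / counts[i])
        active.remove(worst)
    return active[0]  # recommended arm
```

With deterministic rewards, e.g. `pull = lambda i: [0.9, 0.1, 0.2][i]`, `K = 3`, and `T = 30`, the sketch identifies arm 0. The paper's upper bounds control the probability of misidentification of this strategy when rewards are drawn from general distributions in D.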