Optimal Best Arm Identification in Two-Armed Bandits with a Fixed Budget under a Small Gap - 专知 (Zhuanzhi) Papers

Bandits · Optimizer · Estimation/Estimator · ARM · Standard deviation

December 28, 2022

Optimal Best Arm Identification in Two-Armed Bandits with a Fixed Budget under a Small Gap

Masahiro Kato,Kaito Ariu,Masaaki Imaizumi,Masahiro Nomura,Chao Qin

We consider fixed-budget best-arm identification in two-armed Gaussian bandit problems. One of the longstanding open questions is the existence of an optimal strategy under which the probability of misidentification matches a lower bound. We show that a strategy following the Neyman allocation rule (Neyman, 1934) is asymptotically optimal when the gap between the expected rewards is small. First, we review a lower bound derived by Kaufmann et al. (2016). Then, we propose the "Neyman Allocation (NA)-Augmented Inverse Probability Weighting (AIPW)" strategy, which consists of a sampling rule that uses the Neyman allocation with an estimated standard deviation and a recommendation rule that uses an AIPW estimator. Our proposed strategy is optimal in the sense that its upper bound on the probability of misidentification matches the lower bound as the budget goes to infinity and the gap goes to zero.
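
The two rules named in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `na_aipw`, the uniform warm-up until each arm has two samples, the standard-deviation floor `min_std`, and the probability clipping `p_min` are all choices made here for illustration. The sampling rule draws each arm with probability proportional to its estimated standard deviation (the Neyman allocation), and the recommendation rule picks the arm with the larger AIPW estimate, where each score uses only estimates computed from past rounds.

```python
import numpy as np


def na_aipw(means, stds, budget, rng, min_std=1e-3, p_min=0.05):
    """Simulate one run on a two-armed Gaussian bandit.

    Returns (recommended_arm, pull_counts). All tuning constants are
    illustrative assumptions, not values taken from the paper.
    """
    sums = np.zeros(2)       # running sum of rewards per arm
    sq_sums = np.zeros(2)    # running sum of squared rewards per arm
    counts = np.zeros(2)     # number of pulls per arm
    aipw_sums = np.zeros(2)  # running sum of AIPW scores per arm

    for _ in range(budget):
        # Plug-in mean estimates from past rounds only (0.0 before any pull).
        mu_hat = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)

        if counts.min() < 2:
            # Warm-up (assumption): sample uniformly until both arms
            # have at least two observations.
            probs = np.array([0.5, 0.5])
        else:
            # Neyman allocation with estimated standard deviations:
            # P(arm a) is proportional to sd_hat[a].
            var_hat = sq_sums / counts - mu_hat**2
            sd_hat = np.sqrt(np.maximum(var_hat, min_std**2))
            p0 = np.clip(sd_hat[0] / sd_hat.sum(), p_min, 1.0 - p_min)
            probs = np.array([p0, 1.0 - p0])

        arm = rng.choice(2, p=probs)
        reward = rng.normal(means[arm], stds[arm])

        # AIPW score: inverse-probability-weighted residual plus plug-in mean.
        pulled = np.zeros(2)
        pulled[arm] = 1.0
        aipw_sums += pulled * (reward - mu_hat) / probs + mu_hat

        sums[arm] += reward
        sq_sums[arm] += reward**2
        counts[arm] += 1

    # Recommendation rule: arm with the largest AIPW estimate
    # (dividing by the budget would not change the argmax).
    return int(np.argmax(aipw_sums)), counts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    best, counts = na_aipw(means=[1.0, 0.0], stds=[1.0, 2.0], budget=2000, rng=rng)
    print("recommended arm:", best, "pull fractions:", counts / counts.sum())
```

With true standard deviations 1 and 2, the Neyman allocation targets pull fractions of 1/3 and 2/3, so the noisier arm receives more of the budget. The gap of 1.0 used here is large and only makes the demonstration easy to check; the paper's optimality claim concerns the regime where the gap goes to zero.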
