Best Arm Identification with Contextual Information under a Small Gap - 专知 (Zhuanzhi) Papers

January 4, 2023

Best Arm Identification with Contextual Information under a Small Gap

Masahiro Kato,Masaaki Imaizumi,Takuya Ishihara,Toru Kitagawa

From arXiv. For the sake of completeness, we show a part of the results of Kato et al. (arXiv:2201.04469). arXiv admin note: text overlap with arXiv:2201.04469

We study the best-arm identification (BAI) problem with a fixed budget and contextual (covariate) information. In each round of an adaptive experiment, after observing contextual information, we choose a treatment arm using past observations and the current context. Our goal is to identify the best treatment arm, that is, the treatment arm with the maximal expected reward marginalized over the contextual distribution, with a minimal probability of misidentification. In this study, we consider a class of nonparametric bandit models that converge to location-shift models when the gaps go to zero. First, we derive lower bounds on the misidentification probability for a certain class of strategies and bandit models (probabilistic models of potential outcomes) under a small-gap regime. A small-gap regime is a situation where the gaps in expected rewards between the best and suboptimal treatment arms go to zero, which corresponds to one of the worst cases in identifying the best treatment arm. We then develop the "Random Sampling (RS)-Augmented Inverse Probability Weighting (AIPW) strategy," which is asymptotically optimal in the sense that the probability of misidentification under the strategy matches the lower bound when the budget goes to infinity in the small-gap regime. The RS-AIPW strategy consists of the RS rule, which tracks a target sample allocation ratio, and the recommendation rule, which uses the AIPW estimator.
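The two components named in the abstract, a random sampling rule and an AIPW-based recommendation rule, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a finite set of contexts, uses running sample means and variances as plug-in estimates, and takes the target allocation to be proportional to the estimated conditional standard deviations; the abstract only says that the RS rule tracks "a target sample allocation ratio", so that particular choice is an assumption. The names `rs_aipw`, `sample_reward`, and `MEANS` are hypothetical.

```python
import numpy as np


def rs_aipw(sample_reward, context_probs, n_arms, budget, rng):
    """Sketch of a random-sampling + AIPW strategy with a fixed budget.

    Assumptions (not taken from the abstract): contexts are discrete, the
    allocation is proportional to estimated conditional standard deviations,
    and nuisance estimates are running sample moments built from past rounds.
    """
    n_ctx = len(context_probs)
    count = np.zeros((n_ctx, n_arms))     # pulls per (context, arm)
    total = np.zeros((n_ctx, n_arms))     # sum of rewards
    total_sq = np.zeros((n_ctx, n_arms))  # sum of squared rewards
    aipw_sum = np.zeros(n_arms)           # running sum of AIPW scores

    for _ in range(budget):
        x = rng.choice(n_ctx, p=context_probs)  # observe the context

        # Plug-in estimates that use past observations only.
        n = np.maximum(count[x], 1.0)
        mean_hat = total[x] / n
        var_hat = np.maximum(total_sq[x] / n - mean_hat ** 2, 1e-2)

        # Sampling rule: draw an arm at random from the estimated allocation.
        probs = np.sqrt(var_hat)
        probs = probs / probs.sum()
        probs = np.clip(probs, 0.05, None)  # floor, then renormalise,
        probs = probs / probs.sum()         # so every arm keeps positive mass
        a = rng.choice(n_arms, p=probs)
        y = sample_reward(x, a, rng)

        # AIPW score: regression estimate plus an inverse-probability-weighted
        # residual for the arm that was actually pulled.
        score = mean_hat.copy()
        score[a] += (y - mean_hat[a]) / probs[a]
        aipw_sum += score

        count[x, a] += 1
        total[x, a] += y
        total_sq[x, a] += y * y

    estimates = aipw_sum / budget
    # Recommendation rule: the arm with the largest AIPW estimate.
    return int(np.argmax(estimates)), estimates


# Toy environment (hypothetical): two contexts, two arms, Gaussian rewards.
MEANS = np.array([[0.0, 1.0], [0.5, 1.5]])


def sample_reward(x, a, rng):
    return MEANS[x, a] + rng.normal()


best, est = rs_aipw(sample_reward, [0.5, 0.5], n_arms=2, budget=2000,
                    rng=np.random.default_rng(0))
print(best, est)
```

In this toy environment the gap between the marginal means is large, so the recommendation is easy; the small-gap regime studied in the paper corresponds to shrinking that gap toward zero.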
