The Role of Contextual Information in Best Arm Identification - 专知论文


INFORMS · ARM · Marginalization · Identifiable · Bandits

June 26, 2021

The Role of Contextual Information in Best Arm Identification


Masahiro Kato, Kaito Ariu

We study the best-arm identification problem with fixed confidence when contextual (covariate) information is available in stochastic bandits. Although we can use contextual information in each round, we are interested in the mean reward marginalized over the context distribution. Our goal is to identify the best arm with as few samples as possible for a given error rate. We derive instance-specific lower bounds on the sample complexity of the problem. Then, we propose a context-aware version of the "Track-and-Stop" strategy, wherein the proportion of arm draws tracks the set of optimal allocations, and prove that the expected number of arm draws matches the lower bound asymptotically. We demonstrate that contextual information can be used to improve the efficiency of identifying the arm with the best marginalized mean reward compared with the results of Garivier & Kaufmann (2016). We experimentally confirm that contextual information contributes to faster best-arm identification.
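As an illustration of the target quantity described in the abstract, here is a minimal sketch of what "marginalized mean reward" means when contexts are discrete. It is not the authors' algorithm: the function names, the finite context set, and the assumption that the context distribution and conditional means are known are all hypothetical choices made for this example.

```python
def marginalized_means(context_probs, cond_means):
    """Average each arm's conditional mean reward over the context distribution.

    context_probs[x] is the probability of context x (assumed known here).
    cond_means[a][x] is the mean reward of arm a under context x (assumed known here).
    """
    return [
        sum(p * m for p, m in zip(context_probs, arm_means))
        for arm_means in cond_means
    ]


def best_arm(context_probs, cond_means):
    """Return the index of the arm with the largest marginalized mean reward."""
    means = marginalized_means(context_probs, cond_means)
    return max(range(len(means)), key=means.__getitem__)


# Hypothetical instance: two equally likely contexts, two arms.
context_probs = [0.5, 0.5]
cond_means = [
    [1.0, 0.0],   # arm 0: best in context 0, worst in context 1
    [0.5, 0.75],  # arm 1: never extreme, but better on average
]

print(marginalized_means(context_probs, cond_means))
print(best_arm(context_probs, cond_means))
```

In this made-up instance arm 0 wins in the first context, yet arm 1 has the larger marginalized mean, so arm 1 is the arm the identification problem is after. In the setting of the abstract these means are unknown and must be estimated from sampled rewards.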


