Instance-optimal PAC Algorithms for Contextual Bandits - Zhuanzhi Papers (专知论文)

Contextual bandits · Bandits · PAC learning theory · Extensibility · Sample complexity

July 5, 2022

Instance-optimal PAC Algorithms for Contextual Bandits

Zhaoqi Li, Lillian Ratliff, Houssam Nassif, Kevin Jamieson, Lalit Jain

In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts remain seldom studied. In this work, we focus on the stochastic bandit problem in the $(\epsilon,\delta)$-$\textit{PAC}$ setting: given a policy class $\Pi$, the goal of the learner is to return a policy $\pi\in \Pi$ whose expected reward is within $\epsilon$ of the optimal policy with probability greater than $1-\delta$. We characterize the first $\textit{instance-dependent}$ PAC sample complexity of contextual bandits through a quantity $\rho_{\Pi}$, and provide matching upper and lower bounds in terms of $\rho_{\Pi}$ for the agnostic and linear contextual best-arm identification settings. We show that no algorithm can be simultaneously minimax-optimal for regret minimization and instance-dependent PAC for best-arm identification. Our main result is a new instance-optimal and computationally efficient algorithm that relies on a polynomial number of calls to an argmax oracle.
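To make the $\epsilon$-optimality criterion and the notion of an argmax oracle concrete, here is a minimal Python sketch on a toy instance. It is not the authors' algorithm. The finite context set, the known mean rewards, the three hand-written policies, and the function names `policy_value`, `is_epsilon_optimal`, and `argmax_oracle` are all assumptions made for illustration, and the oracle is implemented by brute force over an explicit policy class, which is only practical when that class is tiny.

```python
from typing import Callable, Sequence

# A policy maps a context index to an arm index (illustrative convention).
Policy = Callable[[int], int]


def policy_value(policy: Policy,
                 context_probs: Sequence[float],
                 mean_rewards: Sequence[Sequence[float]]) -> float:
    """Expected reward of `policy` when context c occurs with probability
    context_probs[c] and arm a has mean reward mean_rewards[c][a]."""
    return sum(p * mean_rewards[c][policy(c)]
               for c, p in enumerate(context_probs))


def is_epsilon_optimal(candidate: Policy,
                       policy_class: Sequence[Policy],
                       context_probs: Sequence[float],
                       mean_rewards: Sequence[Sequence[float]],
                       epsilon: float) -> bool:
    """True if the candidate's value is within epsilon of the best in the class."""
    best = max(policy_value(pi, context_probs, mean_rewards)
               for pi in policy_class)
    return policy_value(candidate, context_probs, mean_rewards) >= best - epsilon


def argmax_oracle(policy_class: Sequence[Policy],
                  contexts: Sequence[int],
                  reward_estimates: Sequence[Sequence[float]]) -> Policy:
    """Brute-force stand-in for an argmax oracle: return the policy with the
    largest total estimated reward on the given (context, estimate) pairs."""
    def total(pi: Policy) -> float:
        return sum(est[pi(c)] for c, est in zip(contexts, reward_estimates))
    return max(policy_class, key=total)


# Toy instance (hypothetical numbers): two contexts, two arms.
context_probs = [0.5, 0.5]
mean_rewards = [[0.9, 0.1],   # context 0: arm 0 is better
                [0.2, 0.8]]   # context 1: arm 1 is better


def always_0(c: int) -> int:
    return 0


def always_1(c: int) -> int:
    return 1


def match_context(c: int) -> int:
    return c


policy_class = [always_0, always_1, match_context]

# Logged contexts with per-arm reward estimates, also made up.
logged_contexts = [0, 1, 1]
logged_estimates = [[1.0, 0.0], [0.0, 1.0], [0.2, 0.7]]
chosen = argmax_oracle(policy_class, logged_contexts, logged_estimates)
```

In this toy instance `match_context` has the highest value, so it passes the $\epsilon$-optimality check for any $\epsilon \ge 0$, while `always_0` fails it at $\epsilon = 0.1$. An $(\epsilon,\delta)$-PAC learner would additionally have to guarantee, from samples alone, that the policy it returns passes this check with probability greater than $1-\delta$.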
