Constrained Pure Exploration Multi-Armed Bandits with a Fixed Budget - Zhuanzhi Papers

Bandits · INFORMS · Optimizer · Estimation/Estimator · ARM

November 27, 2022

Constrained Pure Exploration Multi-Armed Bandits with a Fixed Budget


Fathima Zarin Faizal, Jayakrishnan Nair

From arXiv, 14 pages

We consider a constrained, pure exploration, stochastic multi-armed bandit formulation under a fixed budget. Each arm is associated with an unknown, possibly multi-dimensional distribution and is described by multiple attributes that are functions of this distribution. The aim is to optimize a particular attribute subject to user-defined constraints on the other attributes. This framework models applications such as financial portfolio optimization, where it is natural to perform risk-constrained maximization of mean return. We assume that the attributes can be estimated using samples from the arms' distributions and that these estimators satisfy suitable concentration inequalities. We propose an algorithm called Constrained-SR, based on the Successive Rejects framework, which recommends an optimal arm and flags the instance as being feasible or infeasible. A key feature of this algorithm is that it is designed on the basis of an information-theoretic lower bound for two-armed instances. We characterize an instance-dependent upper bound on the probability of error under Constrained-SR, which decays exponentially with respect to the budget. We further show that the associated decay rate is nearly optimal relative to an information-theoretic lower bound in certain special cases.
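To make the setting concrete, the following Python sketch runs the classic Successive Rejects phase schedule on a risk-constrained instance and returns a recommended arm together with a feasibility flag. It is an illustration under stated assumptions, not the paper's Constrained-SR: the abstract does not give the rejection rule, so the ranking used here (empirically feasible arms first, ordered by mean; empirically infeasible arms last, ordered by constraint violation) is a hypothetical stand-in. The names `sr_phase_lengths`, `constrained_sr_sketch`, and `risk_cap`, and the choice of variance as the constrained attribute, are likewise assumptions made for the example.

```python
import math
from statistics import fmean, pvariance


def sr_phase_lengths(num_arms: int, budget: int) -> list[int]:
    """Cumulative per-arm pull counts n_1..n_{K-1} of classic Successive Rejects."""
    log_bar = 0.5 + sum(1.0 / i for i in range(2, num_arms + 1))
    return [
        math.ceil((budget - num_arms) / (log_bar * (num_arms + 1 - k)))
        for k in range(1, num_arms)
    ]


def constrained_sr_sketch(arms, budget: int, risk_cap: float):
    """Recommend an arm and flag feasibility (hypothetical rejection rule).

    arms: list of zero-argument callables, each returning one sample.
    Objective (assumed): maximize the mean subject to variance <= risk_cap.
    """
    if budget < 2 * len(arms):
        raise ValueError("budget too small to sample every arm in phase 1")

    samples = [[] for _ in arms]
    active = list(range(len(arms)))
    pulled = 0  # pulls per active arm so far

    for n_k in sr_phase_lengths(len(arms), budget):
        # Top up every surviving arm to n_k samples in total.
        for i in active:
            samples[i].extend(arms[i]() for _ in range(n_k - pulled))
        pulled = n_k

        def badness(i):
            risk = pvariance(samples[i])
            if risk <= risk_cap:
                # Empirically feasible: worse means a lower empirical mean.
                return (0, -fmean(samples[i]))
            # Empirically infeasible: always worse than any feasible arm,
            # and worse still the larger the violation (assumed rule).
            return (1, risk - risk_cap)

        active.remove(max(active, key=badness))

    best = active[0]
    return best, pvariance(samples[best]) <= risk_cap
```

For example, calling `constrained_sr_sketch` with three Gaussian samplers built from `random.gauss`, a budget of a few thousand pulls, and a `risk_cap` that only some arms satisfy returns the index of the surviving arm and whether its empirical variance meets the cap. The phase schedule guarantees the total number of pulls never exceeds the budget.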

