Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions - 专知论文

October 15, 2020

Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions

Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan

From arXiv, 18 pages, 3 figures

We consider a best arm identification (BAI) problem for stochastic bandits with adversarial corruptions in the fixed-budget setting of $T$ steps. We design a novel randomized algorithm, Probabilistic Sequential Shrinking$(u)$ (PSS$(u)$), which is agnostic to the amount of corruptions. When the amount of corruptions per step (CPS) is below a threshold, PSS$(u)$ identifies the best arm or item with probability tending to $1$ as $T \to \infty$. Otherwise, the optimality gap of the identified item degrades gracefully with the CPS. We argue that such a bifurcation is necessary. In addition, we show that when the CPS is sufficiently large, no algorithm can achieve a BAI probability tending to $1$ as $T \to \infty$. In PSS$(u)$, the parameter $u$ serves to balance between the optimality gap and success probability. En route, the injection of randomization is shown to be essential to mitigate the impact of corruptions. Indeed, we show that PSS$(u)$ has a better performance than its deterministic analogue, the Successive Halving (SH) algorithm by Karnin et al. (2013). PSS$(2)$'s performance guarantee matches SH's when there is no corruption. Finally, we identify a term in the exponent of the failure probability of PSS$(u)$ that generalizes the common $H_2$ term for BAI under the fixed-budget setting.
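The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: a phase-based procedure that shrinks the active set by a factor of $u$ per phase, with arms pulled either uniformly at random (the randomized flavour) or in round-robin order (a deterministic baseline in the spirit of Successive Halving when $u = 2$). The function names `num_phases`, `shrink_identify`, and `h2`, the equal-length phases, the Bernoulli reward model, and the absence of any corruption model are all assumptions made for illustration, not details taken from the paper. The helper `h2` computes the standard fixed-budget hardness term $H_2 = \max_{i \ge 2} i / \Delta_{(i)}^2$, assuming a unique best arm.

```python
import math
import random


def num_phases(n_arms, u):
    """Smallest M >= 1 with u**M >= n_arms (avoids float log rounding)."""
    m = 1
    while u ** m < n_arms:
        m += 1
    return m


def shrink_identify(means, budget, u=2.0, randomized=True, seed=0):
    """Return the index of the arm left after phase-wise shrinking.

    Assumptions (illustrative only): Bernoulli rewards with the given
    means, equal-length phases, and no adversarial corruption.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    active = list(range(n_arms))
    n_ph = num_phases(n_arms, u)
    phase_len = budget // n_ph

    for m in range(1, n_ph + 1):
        sums = dict.fromkeys(active, 0.0)
        pulls = dict.fromkeys(active, 0)
        for t in range(phase_len):
            if randomized:
                # Randomized flavour: pick an active arm uniformly at random.
                arm = rng.choice(active)
            else:
                # Deterministic baseline: round-robin over the active arms.
                arm = active[t % len(active)]
            reward = 1.0 if rng.random() < means[arm] else 0.0
            sums[arm] += reward
            pulls[arm] += 1

        def empirical_mean(i):
            return sums[i] / pulls[i] if pulls[i] else 0.0

        # Keep roughly n_arms / u**m arms, and always at least one.
        keep = min(len(active), max(1, math.ceil(n_arms / u ** m)))
        active = sorted(active, key=empirical_mean, reverse=True)[:keep]

    return active[0]


def h2(means):
    """H_2 = max over i >= 2 of i / gap_(i)^2, gaps sorted ascending.

    Assumes at least two arms and a unique best arm (no zero gaps).
    """
    best = max(means)
    gaps = sorted(best - mu for mu in means)[1:]  # drop the best arm's zero gap
    return max(i / gap ** 2 for i, gap in enumerate(gaps, start=2))


if __name__ == "__main__":
    arms = [0.9, 0.1, 0.1, 0.1]
    print(shrink_identify(arms, budget=4000))
    print(shrink_identify(arms, budget=4000, randomized=False))
    print(h2([0.9, 0.5, 0.1]))
```

Larger values of `u` discard arms more aggressively and use fewer phases, which is one way to read the trade-off between optimality gap and success probability that the abstract attributes to the parameter $u$. The sketch leaves corruptions out entirely, so it does not reproduce the paper's comparison between the randomized and deterministic variants.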
