Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme


May 23, 2021


Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Or Sheffet, Anand D. Sarwate

From arXiv, 18 pages, 4 figures

We study the best-arm identification problem in multi-armed bandits with stochastic, potentially private rewards, when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a (non-private) successive elimination algorithm for strictly optimal best-arm identification; we show that the algorithm is $\delta$-PAC and characterize its sample complexity. Further, we provide a lower bound on the expected number of pulls, showing that the proposed algorithm is essentially optimal up to logarithmic factors. Both the upper and lower complexity bounds depend on a special definition of the associated suboptimality gap, designed specifically for the quantile bandit problem: as we show, when the gap approaches zero, best-arm identification is impossible. Second, motivated by applications where the rewards are private, we provide a differentially private successive elimination algorithm whose sample complexity is finite even for distributions with infinite support size, and we characterize its sample complexity. Our algorithms do not require prior knowledge of either the suboptimality gap or other statistical information related to the bandit problem at hand.
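To make the successive elimination idea concrete, here is a minimal Python sketch for the non-private setting. It is not the algorithm from the paper: the confidence intervals come from the generic Dvoretzky-Kiefer-Wolfowitz (DKW) inequality rather than the paper's own bounds and gap definition, and the names `empirical_quantile`, `quantile_bounds`, `successive_elimination`, as well as the `batch` and `max_rounds` parameters, are assumptions made for illustration only.

```python
import math
import random


def empirical_quantile(samples, q):
    """Smallest sample x whose empirical CDF value is at least q (0 < q <= 1)."""
    xs = sorted(samples)
    k = max(1, math.ceil(q * len(xs)))  # 1-based rank of the order statistic
    return xs[min(k, len(xs)) - 1]


def quantile_bounds(samples, q, delta):
    """Interval containing the true q-quantile with probability >= 1 - delta.

    Uses the DKW inequality: sup |F_n - F| <= eps with
    eps = sqrt(log(2 / delta) / (2 n)).
    """
    n = len(samples)
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    lo = empirical_quantile(samples, q - eps) if q - eps > 0 else -math.inf
    hi = empirical_quantile(samples, q + eps) if q + eps < 1 else math.inf
    return lo, hi


def successive_elimination(arms, q, delta, batch=50, max_rounds=200):
    """Return the index of the arm believed to have the highest q-quantile.

    `arms` is a list of zero-argument callables, each returning one reward.
    `batch` and `max_rounds` are illustrative knobs, not from the paper.
    """
    active = list(range(len(arms)))
    samples = [[] for _ in arms]
    for r in range(1, max_rounds + 1):
        # Pull every surviving arm the same number of times.
        for i in active:
            samples[i].extend(arms[i]() for _ in range(batch))
        # Union bound over arms and rounds: the sum over r of 1 / (2 r^2) is < 1.
        d = delta / (len(arms) * 2 * r * r)
        bounds = {i: quantile_bounds(samples[i], q, d) for i in active}
        best_lo = max(lo for lo, _ in bounds.values())
        # Drop arms whose upper bound falls below the best lower bound.
        active = [i for i in active if bounds[i][1] >= best_lo]
        if len(active) == 1:
            return active[0]
    # Budget exhausted: fall back to the best empirical quantile.
    return max(active, key=lambda i: empirical_quantile(samples[i], q))


if __name__ == "__main__":
    random.seed(0)
    # Three hypothetical Gaussian arms; the last has the highest median.
    arms = [lambda m=m: random.gauss(m, 1.0) for m in (0.0, 1.0, 3.0)]
    print(successive_elimination(arms, q=0.5, delta=0.05))
```

The sketch needs no knowledge of the gaps between arms: the intervals shrink as samples accumulate, and an arm is dropped only once its interval lies entirely below that of another arm. A differentially private variant would additionally have to randomize the quantile estimates, which this sketch does not attempt.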

