On the bias, risk and consistency of sample means in multi-armed bandits - 专知 Papers

Sample mean · Biased · Bandits (slot machines) · Mean · Sample

April 30, 2021

On the bias, risk and consistency of sample means in multi-armed bandits

Jaehyeok Shin, Aaditya Ramdas, Alessandro Rinaldo

From arXiv, 48 pages

The sample mean is among the most well-studied estimators in statistics, having many desirable properties such as unbiasedness and consistency. However, when analyzing data collected using a multi-armed bandit (MAB) experiment, the sample mean is biased and much remains to be understood about its properties. For example, when is it consistent, how large is its bias, and can we bound its mean squared error? This paper delivers a thorough and systematic treatment of the bias, risk and consistency of MAB sample means. Specifically, we identify four distinct sources of selection bias (sampling, stopping, choosing and rewinding) and analyze them both separately and together. We further demonstrate that a new notion of "effective sample size" can be used to bound the risk of the sample mean under suitable loss functions. We present several carefully designed examples to provide intuition on the different sources of selection bias we study. Our treatment is nonparametric and algorithm-agnostic, meaning that it is not tied to a specific algorithm or goal. In a nutshell, our proofs combine variational representations of information-theoretic divergences with new martingale concentration inequalities.
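The claim that adaptively collected data biases the sample mean can be illustrated with a small simulation. The sketch below is not taken from the paper: the greedy policy, the two standard-normal arms, the horizon of 20 pulls, and all function names are illustrative assumptions. It compares the average sample mean of one arm under greedy (adaptive) sampling against a fixed round-robin allocation, and touches only the "sampling" source of bias named in the abstract.

```python
import random


def run_greedy(rng, horizon, true_means):
    """Pull each arm once, then always pull the arm with the larger sample mean."""
    n_arms = len(true_means)
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    for t in range(horizon):
        if t < n_arms:
            arm = t  # initial pass: one pull per arm
        else:
            arm = max(range(n_arms), key=lambda k: sums[k] / counts[k])
        sums[arm] += rng.gauss(true_means[arm], 1.0)  # unit-variance Gaussian reward
        counts[arm] += 1
    return [s / c for s, c in zip(sums, counts)]


def run_fixed(rng, horizon, true_means):
    """Non-adaptive baseline: round-robin allocation that ignores the observed rewards."""
    n_arms = len(true_means)
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    for t in range(horizon):
        arm = t % n_arms
        sums[arm] += rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
    return [s / c for s, c in zip(sums, counts)]


def estimate_bias(policy, n_runs=20000, horizon=20, true_means=(0.0, 0.0), seed=0):
    """Monte Carlo estimate of E[sample mean of arm 0] minus its true mean."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        total += policy(rng, horizon, true_means)[0]
    return total / n_runs - true_means[0]


greedy_bias = estimate_bias(run_greedy)
fixed_bias = estimate_bias(run_fixed)
print(f"greedy sampling bias of arm 0:      {greedy_bias:+.3f}")
print(f"round-robin sampling bias of arm 0: {fixed_bias:+.3f}")
```

Under these assumptions the greedy estimate should come out clearly negative while the round-robin estimate stays near zero: an arm that starts with an unlucky low draw tends to be abandoned, so its low sample mean is never corrected by further pulls.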
