Catoni-style confidence sequences for heavy-tailed mean estimation

A confidence sequence (CS) is a sequence of confidence intervals that is valid at arbitrary data-dependent stopping times. These are useful in applications like A/B testing, multi-armed bandits, off-policy evaluation, election auditing, etc. We present three approaches to constructing a confidence sequence for the population mean, under the assumption that only an upper bound $\sigma^2$ on the variance is known. While previous works rely on light-tail assumptions like boundedness or subGaussianity (under which all moments of a distribution exist), the confidence sequences in our work are able to handle data from a wide range of heavy-tailed distributions. The best among our three methods -- the Catoni-style confidence sequence -- performs remarkably well in practice, essentially matching the state-of-the-art methods for $\sigma^2$-subGaussian data. Our findings have important implications for sequential experimentation with unbounded observations, since the $\sigma^2$-bounded-variance assumption is more realistic and easier to verify than $\sigma^2$-subGaussianity (which implies the former).
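To make the idea concrete, here is a minimal Python sketch of how a Catoni-style confidence sequence for the mean might be computed when only a variance bound $\sigma^2$ is known. It is an illustration under stated assumptions, not necessarily the exact construction or tuning used in this work: the particular influence function `catoni_phi`, the weight schedule `decaying_lambdas`, and all function names are choices made here. The sketch uses an influence function $\phi$ satisfying $-\log(1 - x + x^2/2) \le \phi(x) \le \log(1 + x + x^2/2)$ and reports, at time $t$, the set of candidate means $m$ with $|\sum_{i \le t} \phi(\lambda_i (X_i - m))| \le \sigma^2 \sum_{i \le t} \lambda_i^2 / 2 + \log(2/\alpha)$.

```python
import math
import random


def catoni_phi(x):
    """An odd, increasing influence function meeting Catoni's logarithmic bounds."""
    if x >= 0:
        return math.log1p(x + 0.5 * x * x)
    return -math.log1p(-x + 0.5 * x * x)


def decaying_lambdas(t, sigma2, alpha):
    """Assumed weight schedule (deterministic, hence predictable); not a tuned choice."""
    return [math.sqrt(2.0 * math.log(2.0 / alpha) / (i * sigma2)) for i in range(1, t + 1)]


def catoni_interval(xs, lambdas, sigma2, alpha):
    """Return the interval {m : |sum_i phi(lam_i * (x_i - m))| <= budget} for t = len(xs)."""
    budget = 0.5 * sigma2 * sum(l * l for l in lambdas) + math.log(2.0 / alpha)

    def f(m):
        # Strictly decreasing in m because phi is strictly increasing and lam_i > 0.
        return sum(catoni_phi(l * (x - m)) for x, l in zip(xs, lambdas))

    def solve(target):
        # Bracket the unique root of f(m) = target, then bisect.
        lo, hi, width = min(xs) - 1.0, max(xs) + 1.0, 1.0
        while f(lo) < target:
            width *= 2.0
            lo -= width
        while f(hi) > target:
            width *= 2.0
            hi += width
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # f is decreasing, so the lower endpoint solves f = +budget, the upper f = -budget.
    return solve(budget), solve(-budget)


if __name__ == "__main__":
    random.seed(0)
    alpha, sigma2 = 0.05, 0.75  # Pareto(shape=3) has mean 1.5 and variance 0.75
    data = [random.paretovariate(3.0) for _ in range(1000)]
    lams = decaying_lambdas(len(data), sigma2, alpha)
    for t in (10, 100, 1000):
        lo, hi = catoni_interval(data[:t], lams[:t], sigma2, alpha)
        print(f"t={t:5d}  interval=({lo:.3f}, {hi:.3f})")
```

The demo draws from a Pareto distribution with shape 3, which has finite variance but an infinite third moment, so it is neither bounded nor subGaussian. Each call rescans all observations seen so far, which keeps the sketch short but makes it unsuitable for updating the interval after every sample on long streams.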
