This paper addresses the following question: given a sample of i.i.d. random variables with finite variance, can one construct an estimator of the unknown mean that performs nearly as well as if the data were normally distributed? One of the most popular estimators achieving this goal is the median of means estimator. However, it is inefficient in the sense that the constants in the resulting bounds are suboptimal. We show that a permutation-invariant modification of the median of means estimator admits deviation guarantees that are sharp up to a $1+o(1)$ factor if the underlying distribution possesses $3+p$ moments for some $p>0$ and is absolutely continuous with respect to the Lebesgue measure. This result yields potential improvements for a variety of algorithms that rely on the median of means estimator as a building block. At the core of our argument is a new deviation inequality for U-statistics whose order is allowed to grow with the sample size, a result that could be of independent interest. Finally, we demonstrate that a hybrid of the median of means and Catoni's estimator is capable of achieving sub-Gaussian deviation guarantees with nearly optimal constants assuming just the existence of the second moment.
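For readers unfamiliar with the construction, the following is a minimal sketch of the classical median of means estimator together with one plausible reading of a permutation-invariant variant. The abstract does not spell out the modification, so the second function rests on an assumption: it takes the median of the sample means over all subsets of a fixed size `m`, which does not depend on the ordering of the data. The function names, the `max_subsets` parameter, and the random-subset approximation are illustrative choices, not taken from the paper.

```python
import itertools
import random
import statistics


def median_of_means(xs, k):
    """Classical median of means: split xs into k disjoint consecutive
    blocks of equal size, average each block, and return the median."""
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(xs)")
    m = n // k  # block size; any leftover points at the end are dropped
    block_means = [statistics.fmean(xs[i * m:(i + 1) * m]) for i in range(k)]
    return statistics.median(block_means)


def subset_median_of_means(xs, m, max_subsets=None, seed=0):
    """Assumed permutation-invariant variant: median of the means taken
    over subsets of size m.

    With max_subsets=None every subset is enumerated, so the result is
    exactly invariant to the order of xs, but the cost grows like
    C(n, m). Otherwise max_subsets random subsets are drawn, which is
    only an approximation (a hypothetical shortcut for larger samples).
    """
    n = len(xs)
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= len(xs)")
    if max_subsets is None:
        subsets = itertools.combinations(xs, m)
    else:
        rng = random.Random(seed)
        subsets = (rng.sample(xs, m) for _ in range(max_subsets))
    return statistics.median(statistics.fmean(s) for s in subsets)
```

The classical version depends on how the sample is partitioned into blocks, whereas the exhaustive subset version treats all observations symmetrically; the price is the combinatorial number of subsets, which is why a subsampling option is included in the sketch.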