Exponential tail bounds for sums play an important role in statistics, but the example of the $t$-statistic shows that exponential tail decay may be lost when population parameters must be estimated from the data. However, if Studentizing is accompanied by estimating the location parameter in a suitable way, then the $t$-statistic regains exponential tail behavior. Motivated by this example, the paper analyzes other ways of empirically standardizing sums and establishes tail bounds that are sub-Gaussian or even closer to normal in the following settings: standardization with Studentized contrasts for normal observations, standardization with the log likelihood ratio statistic for observations from an exponential family, and standardization via self-normalization for observations from a symmetric distribution with unknown center of symmetry. The last of these standardizations gives rise to a novel scan statistic for heteroscedastic data, whose asymptotic power is analyzed in the case where the observations have a log-concave distribution.
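To make the notion of a sub-Gaussian tail bound under self-normalization concrete, the following is a minimal sketch of the classical case only, assuming the center of symmetry is known and equal to 0 (the abstract concerns the harder case of an unknown center, which this sketch does not address). For independent observations symmetric about 0, the self-normalized sum $S_n/V_n$, with $S_n=\sum_i X_i$ and $V_n^2=\sum_i X_i^2$, satisfies $P(S_n/V_n \ge t) \le \exp(-t^2/2)$, even for heavy-tailed data. The function names, the choice of Cauchy draws, and the sample sizes are illustrative assumptions, not taken from the paper.

```python
import math
import random

def self_normalized_sum(xs):
    """Return S_n / V_n with S_n = sum(x) and V_n = sqrt(sum(x^2))."""
    s = sum(xs)
    v = math.sqrt(sum(x * x for x in xs))
    return s / v

def empirical_tail(n, t, reps, rng):
    """Monte Carlo estimate of P(S_n / V_n >= t) for symmetric heavy-tailed samples."""
    hits = 0
    for _ in range(reps):
        # Standard Cauchy draws via the inverse CDF: symmetric about 0,
        # with no finite mean or variance (an illustrative assumption).
        xs = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
        if self_normalized_sum(xs) >= t:
            hits += 1
    return hits / reps

rng = random.Random(0)  # fixed seed so the run is reproducible
n, reps = 20, 5000

# Map each threshold t to (empirical tail frequency, sub-Gaussian bound exp(-t^2/2)).
results = {
    t: (empirical_tail(n, t, reps, rng), math.exp(-t * t / 2))
    for t in (1.0, 2.0, 3.0)
}

for t, (freq, bound) in results.items():
    print(f"t={t}: empirical tail {freq:.4f}, bound {bound:.4f}")
```

In this sketch the empirical tail frequencies should fall below the bound at each threshold, although an ordinary (unnormalized) sum of Cauchy variables has no exponential tail decay at all.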