Exponential tail bounds for sums play an important role in statistics, but the example of the $t$-statistic shows that the exponential tail decay may be lost when population parameters need to be estimated from the data. However, it turns out that if Studentizing is accompanied by estimating the location parameter in a suitable way, then the $t$-statistic regains the exponential tail behavior. Motivated by this example, the paper analyzes other ways of empirically standardizing sums and establishes tail bounds that are sub-Gaussian or even closer to normal for the following settings: standardization with Studentized contrasts for normal observations, standardization with the log likelihood ratio statistic for observations from an exponential family, and standardization via self-normalization for observations from a symmetric distribution with unknown center of symmetry. The last of these standardizations gives rise to a novel scan statistic for heteroscedastic data, whose asymptotic power is analyzed.