A confidence sequence (CS) is a sequence of confidence intervals that is valid at arbitrary data-dependent stopping times. CSs are used in a growing range of applications involving sequential experimentation, such as A/B testing, multi-armed bandits, off-policy evaluation, and election auditing. In this paper, we present three approaches to constructing a confidence sequence for the population mean under the very weak assumption that only an upper bound on the variance is known. While previous works all rely on stringent light-tail assumptions like boundedness or sub-Gaussianity (under which all moments of a distribution exist), the confidence sequences in our work can handle data from a wide range of heavy-tailed distributions (where no moment beyond the second is required to exist). Moreover, we show that even under this simple assumption, the best of our three methods, the Catoni-style confidence sequence, performs remarkably well in terms of tightness, essentially matching the best methods for sub-Gaussian data. Our findings have important practical implications for experiments with unbounded observations, since the finite-variance assumption is often more realistic and easier to verify than sub-Gaussianity.
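To make the idea of a variance-only confidence sequence concrete, the following is a minimal sketch of one Catoni-style construction. It is an illustration under stated assumptions, not necessarily the exact procedure of the paper: the observations are assumed to share a common mean with variance at most `sigma2`, the influence function `catoni_phi` is the standard Catoni-type choice, and the tuning sequence `lam` is an illustrative choice made here (any positive sequence fixed in advance would keep the guarantee). The function names `catoni_phi` and `catoni_cs` are hypothetical. The interval at time t collects every candidate mean m for which the sum of phi(lam_i * (x_i - m)) stays within plus or minus (sum of lam_i^2 * sigma2 / 2 + log(2 / alpha)); time-uniform validity then follows from a supermartingale argument combined with Ville's inequality.

```python
import math
import random


def catoni_phi(x):
    """Catoni-type influence function: grows only logarithmically in |x|."""
    if x >= 0:
        return math.log1p(x + 0.5 * x * x)
    return -math.log1p(-x + 0.5 * x * x)


def _score(m, lams, xs):
    # zip stops at len(lams), so only the first t observations are used.
    return sum(catoni_phi(lam * (x - m)) for lam, x in zip(lams, xs))


def _solve(target, lams, xs, a, b, iters=60):
    """Bisection for _score(m) == target; _score is strictly decreasing in m."""
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if _score(mid, lams, xs) > target:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


def catoni_cs(xs, sigma2, alpha=0.05):
    """Return one (lower, upper) interval per sample size t = 1..len(xs).

    Assumes a common mean and variance at most sigma2 (an upper bound).
    """
    log_term = math.log(2.0 / alpha)
    lams, budget, intervals = [], 0.0, []
    for t in range(1, len(xs) + 1):
        # Illustrative tuning sequence (an assumption of this sketch).
        lam = math.sqrt(2.0 * log_term / (sigma2 * t * math.log(t + 1)))
        lams.append(lam)
        budget += 0.5 * lam * lam * sigma2
        radius = budget + log_term
        # Widen the bracket until the score crosses both thresholds.
        a, b = min(xs[:t]) - 1.0, max(xs[:t]) + 1.0
        while _score(a, lams, xs) <= radius:
            a -= b - a
        while _score(b, lams, xs) >= -radius:
            b += b - a
        lower = _solve(radius, lams, xs, a, b)
        upper = _solve(-radius, lams, xs, a, b)
        intervals.append((lower, upper))
    return intervals


if __name__ == "__main__":
    random.seed(1)
    # Pareto with shape 3: mean 1.5, variance 0.75, infinite third moment.
    data = [random.paretovariate(3.0) for _ in range(200)]
    for t, (lower, upper) in enumerate(catoni_cs(data, sigma2=0.75), start=1):
        if t in (10, 50, 200):
            print(t, round(lower, 3), round(upper, 3))
```

The demo draws from a Pareto distribution with shape 3, which has a finite variance but no finite third moment, so it is the kind of heavy-tailed input that boundedness or sub-Gaussian methods do not cover. This sketch recomputes the score from scratch at every step for clarity, so its cost grows quadratically with the sample size.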