Many practical tasks involve sampling sequentially without replacement (WoR) from a finite population of size $N$ in order to estimate some parameter $\theta^\star$. Accurately quantifying uncertainty throughout this process is a nontrivial task, but it is necessary because it often determines when we stop collecting samples and confidently report a result. We present a suite of tools for designing confidence sequences (CSs) for $\theta^\star$. A CS is a sequence of confidence sets $(C_n)_{n=1}^N$ that shrink in size and all contain $\theta^\star$ simultaneously with high probability. We present a generic approach to constructing a frequentist CS using Bayesian tools, based on the fact that the ratio of a prior to the posterior at the ground truth is a martingale. We then present Hoeffding- and empirical-Bernstein-type time-uniform CSs and fixed-time confidence intervals for sampling WoR, which improve on previous bounds in the literature and explicitly quantify the benefit of WoR sampling.
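The prior-to-posterior ratio idea can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the abstract: a binary population whose unknown number of ones plays the role of $\theta^\star$, and a beta-binomial prior with hypothetical parameters `a` and `b`. A candidate value stays in the confidence set as long as its prior-to-posterior ratio is below $1/\alpha$; by Ville's inequality the true value is excluded at any time with probability at most $\alpha$. All function names are illustrative.

```python
import math
import random


def log_beta(x, y):
    """Log of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def log_betabinom_pmf(k, n, a, b):
    """Log pmf of a beta-binomial(n, a, b) distribution evaluated at k."""
    if k < 0 or k > n:
        return -math.inf
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def ppr_confidence_set(N, n, s, alpha=0.05, a=1.0, b=1.0):
    """Candidate totals n1 in {0, ..., N} whose prior/posterior ratio is < 1/alpha.

    N: population size; n: draws so far (without replacement); s: ones seen so far.
    Assumed model: prior on the total number of ones is beta-binomial(N, a, b).
    """
    keep = []
    for n1 in range(N + 1):
        log_prior = log_betabinom_pmf(n1, N, a, b)
        # Posterior on the ones remaining among the N - n unseen items.
        log_post = log_betabinom_pmf(n1 - s, N - n, a + s, b + n - s)
        if log_post == -math.inf:
            continue  # candidate is impossible given the data
        if log_prior - log_post < math.log(1.0 / alpha):
            keep.append(n1)
    return keep


def run_demo(N=200, true_ones=60, alpha=0.05, seed=0):
    """Sample a shuffled binary population WoR; return running-intersection sizes."""
    rng = random.Random(seed)
    population = [1] * true_ones + [0] * (N - true_ones)
    rng.shuffle(population)
    running = set(range(N + 1))
    sizes, s = [], 0
    for n, x in enumerate(population, start=1):
        s += x
        # Intersecting over time is allowed because coverage is time-uniform.
        running &= set(ppr_confidence_set(N, n, s, alpha))
        sizes.append(len(running))
    return sizes


if __name__ == "__main__":
    sizes = run_demo()
    print("set size after 10, 50, 100, 200 draws:",
          sizes[9], sizes[49], sizes[99], sizes[199])
```

Before any data are seen the set contains every candidate, and once the whole population has been sampled it can contain only the observed total, which reflects the benefit of WoR sampling in this toy setting.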