Adaptive experiment designs can dramatically improve statistical efficiency in randomized trials, but they also complicate statistical inference. For example, it is now well known that the sample mean is biased in adaptive trials. Inferential challenges are exacerbated when our parameter of interest differs from the parameter the trial was designed to target, such as when we are interested in estimating the value of a sub-optimal treatment after running a trial to determine the optimal treatment using a stochastic bandit design. In this context, typical estimators that use inverse propensity weighting to eliminate sampling bias can be problematic: their distributions become skewed and heavy-tailed as the propensity scores decay to zero. In this paper, we present a class of estimators that overcome these issues. Our approach is to adaptively reweight the terms of an augmented inverse propensity weighting estimator to control the contribution of each term to the estimator's variance. This adaptive weighting scheme prevents estimates from becoming heavy-tailed, ensuring asymptotically correct coverage. It also reduces variance, allowing us to test hypotheses with greater power, especially hypotheses that were not targeted by the experimental design. We validate the accuracy of the resulting estimates and their confidence intervals in numerical experiments and show that our methods compare favorably to existing alternatives in terms of RMSE and coverage.
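To make the reweighting idea concrete, the following is a minimal sketch of an adaptively weighted augmented inverse propensity weighting (AIPW) estimate for a single arm. It is an illustration under stated assumptions, not the paper's implementation. The function name `adaptively_weighted_aipw`, the default running-mean plug-in, the weight choice h_t = sqrt(e_t), and the standard-error formula are all assumptions made for this example. The weights shrink the terms whose inverse-propensity factor 1/e_t would otherwise dominate the variance.

```python
import numpy as np


def adaptively_weighted_aipw(rewards, arms, propensities, arm, mu_hat=None):
    """Sketch of an adaptively weighted AIPW estimate of one arm's value.

    rewards:      observed outcome at each step, shape (T,)
    arms:         arm pulled at each step, shape (T,)
    propensities: probability that `arm` was assigned at each step, given
                  the history; assumed known from the design and > 0
    mu_hat:       optional plug-in estimate of the arm mean at each step,
                  built only from earlier steps (hypothetical default below)
    """
    rewards = np.asarray(rewards, dtype=float)
    arms = np.asarray(arms)
    e = np.asarray(propensities, dtype=float)
    pulled = (arms == arm).astype(float)

    if mu_hat is None:
        # Assumed default: running mean of this arm's rewards over strictly
        # earlier steps, and 0 before the arm has been pulled at all.
        past_sum = np.cumsum(pulled * rewards) - pulled * rewards
        past_cnt = np.cumsum(pulled) - pulled
        mu_hat = np.divide(past_sum, past_cnt,
                           out=np.zeros_like(past_sum), where=past_cnt > 0)
    else:
        mu_hat = np.asarray(mu_hat, dtype=float)

    # AIPW score: plug-in estimate plus an inverse-propensity-weighted residual.
    scores = mu_hat + pulled / e * (rewards - mu_hat)

    # Assumed weights: h_t = sqrt(e_t), so h_t**2 / e_t stays constant and no
    # single low-propensity term dominates the variance.
    h = np.sqrt(e)
    estimate = np.sum(h * scores) / np.sum(h)

    # Plug-in standard error for the weighted average (illustrative only).
    std_err = np.sqrt(np.sum(h ** 2 * (scores - estimate) ** 2)) / np.sum(h)
    return estimate, std_err


if __name__ == "__main__":
    # Toy data: arm 1 has mean reward 1, and its assignment probability
    # decays over time, as it might for a sub-optimal arm in a bandit.
    rng = np.random.default_rng(0)
    T = 20000
    e = (1.0 + np.arange(T)) ** -0.25
    arms = (rng.random(T) < e).astype(int)
    rewards = rng.normal(loc=arms.astype(float), scale=1.0)
    est, se = adaptively_weighted_aipw(rewards, arms, e, arm=1)
    print(f"estimate={est:.3f}, 95% CI half-width={1.96 * se:.3f}")
```

Setting `h` to all ones recovers the unweighted AIPW average, which gives a simple baseline for comparing how the two behave as the propensities decay.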