Randomization testing is a fundamental method in statistics, enabling inferential tasks such as testing for (conditional) independence of random variables, constructing confidence intervals in semiparametric location models, and constructing model-free prediction intervals via conformal inference (by inverting a permutation test). Randomization tests are exactly valid for any sample size, but their use is generally confined to exchangeable data. Yet in many applications, data is routinely collected adaptively via, e.g., (contextual) bandit and reinforcement learning algorithms or adaptive experimental designs. In this paper we present a general framework for randomization testing on adaptively collected data, despite its non-exchangeability, based on a novel weighted randomization test. We also present computationally tractable resampling algorithms for this test that cover a range of popular adaptive assignment algorithms, data-generating environments, and inferential tasks. Finally, we demonstrate via a range of simulations the efficacy of our framework for both testing and confidence/prediction interval construction.
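As a concrete illustration of the kind of quantity such a test produces, the sketch below shows one generic way a weighted randomization p-value could be computed once resampled datasets and their weights are in hand. This is a minimal, hedged sketch: the function name, the NumPy-based interface, and the assumption that larger statistics are more extreme are illustrative choices of ours, not the paper's construction; the paper's actual weights and tractable resampling algorithms for specific adaptive assignment mechanisms are developed in the main text.

```python
import numpy as np

def weighted_randomization_pvalue(t_obs, t_resampled, w_obs, w_resampled):
    """Illustrative sketch of a weighted randomization p-value.

    t_obs       : test statistic evaluated on the observed data.
    t_resampled : statistics on M resampled assignment sequences/datasets.
    w_obs       : weight attached to the observed data.
    w_resampled : weights attached to the M resamples, intended to correct
                  for the non-exchangeability induced by adaptive collection
                  (e.g., a bandit assignment mechanism). The exact weighting
                  scheme is an assumption here, not taken from the paper.
    """
    t_all = np.concatenate(([t_obs], np.asarray(t_resampled, dtype=float)))
    w_all = np.concatenate(([w_obs], np.asarray(w_resampled, dtype=float)))
    # Weighted proportion of statistics (observed one included) that are at
    # least as extreme as the observed statistic.
    return float(np.sum(w_all * (t_all >= t_obs)) / np.sum(w_all))

# Toy usage: with uniform weights this reduces to an ordinary
# (unweighted) randomization test p-value.
rng = np.random.default_rng(0)
p = weighted_randomization_pvalue(
    t_obs=2.1,
    t_resampled=rng.normal(size=1000),
    w_obs=1.0,
    w_resampled=np.ones(1000),
)
```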