Randomization testing is a fundamental method in statistics, enabling inferential tasks such as testing for (conditional) independence of random variables, constructing confidence intervals in semiparametric location models, and constructing (by inverting a permutation test) model-free prediction intervals via conformal inference. Randomization tests are exactly valid for any sample size, but their use is generally confined to exchangeable data. Yet in many applications, data is routinely collected adaptively via, e.g., (contextual) bandit and reinforcement learning algorithms or adaptive experimental designs. In this paper we present a general framework for randomization testing on adaptively collected data (despite its non-exchangeability) that uses a novel weighted randomization test, for which we also present novel computationally tractable resampling algorithms for various popular adaptive assignment algorithms, data-generating environments, and types of inferential tasks. Finally, we demonstrate via a range of simulations the efficacy of our framework for both testing and confidence/prediction interval construction.
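As a point of reference for the exchangeable setting mentioned above, a classical Monte Carlo permutation test can be sketched as follows. This is only an illustrative baseline, not the weighted randomization test proposed in the paper: it assumes two independent samples that are exchangeable under the null hypothesis, and the function name `permutation_pvalue`, the difference-in-means statistic, and the parameter `num_resamples` are hypothetical choices made for this example.

```python
import numpy as np


def permutation_pvalue(x, y, num_resamples=999, seed=0):
    """Two-sample Monte Carlo permutation test (illustrative sketch).

    Assumes the pooled observations are exchangeable under the null,
    which is exactly the assumption that adaptive data collection breaks.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])
    n_x = x.size

    # Test statistic: absolute difference in sample means.
    observed = abs(x.mean() - y.mean())

    exceed = 0
    for _ in range(num_resamples):
        # Uniformly random relabeling of the pooled sample.
        perm = rng.permutation(pooled)
        stat = abs(perm[:n_x].mean() - perm[n_x:].mean())
        if stat >= observed:
            exceed += 1

    # Counting the observed statistic itself (the "+1" terms) keeps the
    # p-value valid at any sample size and any number of resamples.
    return (1 + exceed) / (1 + num_resamples)
```

Here every relabeling is drawn with equal probability. The framework described in the abstract instead uses a weighted test to handle adaptively collected, non-exchangeable data; the details of that weighting are not given in this section and are not reproduced in the sketch.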