We consider after-study statistical inference for sequentially designed experiments in which multiple units are assigned treatments at multiple time points using treatment policies that adapt over time. Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale, namely the mean outcome under each treatment for each unit and each time, with minimal assumptions on the adaptive treatment policy. Without any structural assumptions on the counterfactual means, this task is infeasible because there are more unknowns than observed data points. To make progress, we introduce a latent factor model over the counterfactual means that serves as a non-parametric generalization of the non-linear mixed effects model and the bilinear latent factor model considered in prior works. For estimation, we use a non-parametric method, namely a variant of nearest neighbors, and establish a non-asymptotic high-probability error bound for the counterfactual mean for each unit and each time. Under regularity conditions, this bound leads to asymptotically valid confidence intervals for the counterfactual mean as the number of units and the number of time points grow to infinity together at suitable rates. We illustrate our theory via several simulations and a case study involving data from HeartSteps, a mobile health clinical trial.
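To make the nearest-neighbor idea concrete, here is a minimal sketch of a unit-based nearest-neighbor estimate of one counterfactual mean. It is an illustration under stated assumptions, not necessarily the exact variant analyzed in the paper: the function name `nn_counterfactual_mean`, the array layout (`Y` for outcomes, `A` for treatments, both of shape units by time points), the mean squared distance over shared treatment times, and the threshold `eta` are all hypothetical choices made for this example.

```python
import numpy as np

def nn_counterfactual_mean(Y, A, i, t, a, eta):
    """Sketch: estimate the mean outcome of unit i at time t under treatment a.

    Y   : (N, T) array of observed outcomes (assumed layout).
    A   : (N, T) array of assigned treatments (assumed layout).
    eta : hypothetical distance threshold; a tuning parameter.
    Returns (estimate, number_of_neighbors); estimate is NaN if no neighbor exists.
    """
    N, T = Y.shape
    other_times = np.arange(T) != t  # leave time t out of the distance
    neighbors = []
    for j in range(N):
        if A[j, t] != a:
            continue  # unit j was not given treatment a at time t
        # Times (other than t) at which both unit i and unit j received treatment a.
        shared = other_times & (A[i] == a) & (A[j] == a)
        if not shared.any():
            continue  # no overlap, so the distance is undefined
        dist = np.mean((Y[i, shared] - Y[j, shared]) ** 2)
        if dist <= eta:
            neighbors.append(j)  # unit i itself qualifies if A[i, t] == a
    if not neighbors:
        return float("nan"), 0
    # Average the neighbors' observed outcomes at time t.
    return float(np.mean(Y[neighbors, t])), len(neighbors)
```

A smaller `eta` keeps only units whose past outcomes closely track unit `i` (less bias, fewer neighbors), while a larger `eta` averages over more units (less variance, more bias). In this sketch the threshold is simply passed in by the caller.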