Treatment effect estimation is a fundamental problem in causal inference. We focus on designing efficient randomized controlled trials to accurately estimate the effect of a treatment on a population of $n$ individuals. In particular, we study sample-constrained treatment effect estimation, where we must select a subset of $s \ll n$ individuals from the population to experiment on and then partition this subset into treatment and control groups. Algorithms for partitioning the entire population into treatment and control groups, or for choosing a single representative subset, have been well studied. The key challenge in our setting is jointly choosing a representative subset and a partition of that subset. We consider both individual and average treatment effect estimation under a linear effects model. We give provably efficient experimental designs and corresponding estimators by identifying connections to discrepancy minimization and to the leverage-score-based sampling used in randomized numerical linear algebra. Our theoretical results transition smoothly to known guarantees as $s$ grows to the population size $n$. We also demonstrate the performance of our algorithms empirically.
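To make the sample-constrained setting concrete, the following is a minimal sketch of one way to pick $s$ individuals by leverage-score sampling and then split them into two groups. It is an illustration under stated assumptions, not the design proposed in this work: the covariate matrix `X`, the helper names `leverage_scores` and `sample_and_split`, sampling without replacement, and the uniform random split (used here in place of any discrepancy-based partition) are all hypothetical choices.

```python
import numpy as np


def leverage_scores(X):
    """Row leverage scores of X: squared row norms of an orthonormal basis of its column span."""
    X = np.asarray(X, dtype=float)
    # Thin SVD; keep only directions with non-negligible singular values.
    U, sing, _ = np.linalg.svd(X, full_matrices=False)
    tol = sing.max() * max(X.shape) * np.finfo(float).eps
    rank = int(np.sum(sing > tol))
    return np.sum(U[:, :rank] ** 2, axis=1)


def sample_and_split(X, s, rng):
    """Pick s rows with probability proportional to leverage, then split them in two.

    Assumption: the split is a plain uniform random split, a placeholder
    rather than a discrepancy-minimizing partition.
    """
    scores = leverage_scores(X)
    probs = scores / scores.sum()
    # Sample s distinct individuals, favoring high-leverage rows.
    chosen = rng.choice(X.shape[0], size=s, replace=False, p=probs)
    shuffled = rng.permutation(chosen)
    half = s // 2
    return shuffled[:half], shuffled[half:]


# Synthetic population: n = 200 individuals with d = 5 covariates (made-up data).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
treatment, control = sample_and_split(X, s=20, rng=rng)
```

The leverage scores sum to the rank of `X`, so rows that are unusual in covariate space are more likely to be selected. A full design would also need an estimator that reweights for the sampling probabilities, which this sketch omits.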