One central goal of the design of observational studies is to embed non-experimental data into an approximate randomized controlled trial using statistical matching. Researchers then make the randomization assumption in their downstream outcome analysis. For a matched-pair design, the randomization assumption states that the treatment assignments across all matched pairs are independent, and that within each pair, the probability that the first subject receives treatment and the second receives control equals the probability that the first receives control and the second receives treatment. In this article, we develop a novel framework for testing the randomization assumption based on solving a clustering problem with side-information using modern statistical learning tools. Our testing framework is nonparametric and finite-sample exact, and it differs from previous proposals in that it can also be used to test a relaxed version of the randomization assumption called the biased randomization assumption. One important by-product of our testing framework is a quantity called the residual sensitivity value (RSV), which quantifies the minimal level of residual confounding due to observed covariates not being well matched. We advocate taking the RSV into account in the downstream primary analysis. The proposed methodology is illustrated by re-examining a famous observational study concerning the effect of right heart catheterization (RHC) in the initial care of critically ill patients.
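To make the randomization assumption concrete, the following minimal sketch checks one of its consequences for a single covariate. If the treated label within each pair is assigned by an independent fair coin flip, the sign of each treated-minus-control covariate difference is equally likely to be positive or negative, so flipping signs at random generates a reference distribution. This is a standard sign-flip balance check, not the clustering-based test proposed in the article; the function name `sign_flip_balance_test` and its parameters are illustrative assumptions.

```python
import random


def sign_flip_balance_test(diffs, n_draws=2000, seed=0):
    """Monte Carlo sign-flip check of covariate balance in matched pairs.

    diffs: treated-minus-control differences of ONE covariate, one per pair.
    Returns (observed statistic, Monte Carlo p-value).
    Illustrative sketch only; not the method developed in the article.
    """
    rng = random.Random(seed)

    def stat(signs):
        # Absolute value of the signed sum of within-pair differences.
        return abs(sum(s * d for s, d in zip(signs, diffs)))

    observed = stat([1] * len(diffs))

    exceed = 0
    for _ in range(n_draws):
        # Under the randomization assumption, each pair's sign is a fair coin.
        signs = [rng.choice((-1, 1)) for _ in diffs]
        if stat(signs) >= observed:
            exceed += 1

    # The "+1" terms count the observed assignment among the draws,
    # which keeps the Monte Carlo p-value valid.
    p_value = (1 + exceed) / (1 + n_draws)
    return observed, p_value


if __name__ == "__main__":
    # Hypothetical data: the treated subject has the larger covariate in every pair.
    imbalanced = [1.0] * 40
    print(sign_flip_balance_test(imbalanced))
```

A small p-value suggests that the covariate is systematically larger (or smaller) for treated subjects than for their matched controls, which is evidence against the randomization assumption for that matched sample. The check looks at one covariate at a time and does not address the biased randomization assumption or the RSV.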