We study the identification and estimation of statistical functionals of multivariate data missing non-monotonically and not-at-random, taking a semiparametric approach. Specifically, we assume that the missingness mechanism satisfies what has been previously called "no self-censoring" or "itemwise conditionally independent nonresponse," which roughly corresponds to the assumption that no partially observed variable directly determines its own missingness status. We show that this assumption, combined with an odds ratio parameterization of the joint density, enables identification of functionals of interest, and we establish the semiparametric efficiency bound for the nonparametric model satisfying this assumption. We propose a practical augmented inverse probability weighted estimator, and in the setting with a (possibly high-dimensional) always-observed subset of covariates, our proposed estimator enjoys a certain double-robustness property. We explore the performance of our estimator with simulation experiments and on a previously studied data set of HIV-positive mothers in Botswana.