In both experimental and observational settings, researchers often know little about why outcomes are missing. To address this uncertainty, we propose bounds on causal effects when outcomes are missing, allowing missingness to be an unobserved mixture of informative and non-informative components. Within this mixed-missingness framework, we explore several assumptions under which bounds on causal effects can be derived, including bounds expressed as functions of user-specified sensitivity parameters. We develop influence-function-based estimators of these bounds that permit flexible, non-parametric, machine-learning-based estimation and that achieve root-n convergence rates and asymptotic normality under relatively mild conditions. We further consider the identification and estimation of bounds for other causal quantities that remain meaningful when informative missingness reflects a competing outcome, such as death. We evaluate the approach in simulation studies and illustrate it with a study of the causal effect of antipsychotic drugs on diabetes risk using a health insurance dataset.