Estimating the effects of continuous-valued interventions from observational data is critically important in fields such as climate science, healthcare, and economics. Recent work focuses on designing neural-network architectures and regularization functions to allow for scalable estimation of average and individual-level dose response curves from high-dimensional, large-sample data. Such methodologies assume ignorability (all confounding variables are observed) and positivity (all levels of treatment can be observed for every unit described by a given covariate value), which are especially challenged in the continuous treatment regime. Developing scalable sensitivity and uncertainty analyses that allow us to understand the ignorance induced in our estimates when these assumptions are relaxed has received less attention. Here, we develop a continuous treatment-effect marginal sensitivity model (CMSM) and derive bounds that agree with both the observed data and a researcher-defined level of hidden confounding. We introduce a scalable algorithm to derive the bounds and uncertainty-aware deep models to efficiently estimate these bounds for high-dimensional, large-sample observational data. We validate our methods using both synthetic and real-world experiments. For the latter, we work in concert with climate scientists interested in evaluating the climatological impacts of human emissions on cloud properties using satellite observations from the past 15 years: a finite-data problem known to be complicated by the presence of a multitude of unobserved confounders.