The synthetic instrument: From sparse association to sparse causation

In many observational studies, researchers are interested in the effects of multiple exposures on a single outcome. Standard approaches for high-dimensional data, such as the lasso, assume that the associations between the exposures and the outcome are sparse. These methods, however, do not estimate the causal effects in the presence of unmeasured confounding. In this paper, we consider an alternative approach that assumes the causal effects of interest are sparse. We show that with sparse causation, the causal effects are identifiable even with unmeasured confounding. At the core of our proposal is a novel device, called the synthetic instrument, that, in contrast to standard instrumental variables, can be constructed using the observed exposures directly. We show that under linear structural equation models, the problem of causal effect estimation can be formulated as an $\ell_0$-penalization problem, and hence can be solved efficiently using off-the-shelf software. Simulations show that our approach outperforms state-of-the-art methods in both low-dimensional and high-dimensional settings. We further illustrate our method using a mouse obesity dataset.
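To see why sparse causation does not imply sparse association, consider a minimal sketch under assumptions that are not taken from the paper: a single unmeasured confounder $U$ with unit variance, exposures $X = \Lambda U + \varepsilon$ with independent noise of variance $\sigma^2$, and outcome $Y = X^\top \beta + \gamma U + e$. In this toy model the population least-squares coefficient of $Y$ on $X$ is $\beta + \gamma \Lambda / (\sigma^2 + \Lambda^\top \Lambda)$, which follows from the Sherman-Morrison formula. The function name `population_ols` and all numeric values below are hypothetical and chosen only for illustration; this is not the synthetic instrument estimator itself.

```python
def population_ols(beta, loadings, gamma, noise_var=1.0):
    """Population OLS coefficients of Y on X in an assumed one-factor linear SEM.

    Assumed toy model (illustrative, not from the paper):
        X = loadings * U + eps,   Var(U) = 1, Var(eps_j) = noise_var
        Y = X . beta + gamma * U + e
    Closed form: beta + gamma * loadings / (noise_var + ||loadings||^2).
    """
    scale = gamma / (noise_var + sum(l * l for l in loadings))
    return [b + scale * l for b, l in zip(beta, loadings)]


# Sparse causation: only the first exposure has a causal effect.
beta = [1.0, 0.0, 0.0, 0.0]
# Every exposure loads on the unmeasured confounder.
loadings = [1.0, 1.0, 1.0, 1.0]

coef = population_ols(beta, loadings, gamma=2.0)
bias = [c - b for c, b in zip(coef, beta)]
print("association coefficients:", coef)
print("confounding bias:", bias)
```

In this example the causal coefficient vector has one nonzero entry, yet every association coefficient is nonzero, so a method that targets sparse associations would be aiming at the wrong quantity even with unlimited data.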
