Double machine learning is a statistical method for leveraging complex black-box models to construct approximately unbiased treatment effect estimates given observational data with high-dimensional covariates, under the assumption of a partially linear model. The idea is to first fit on a subset of the samples two non-linear predictive models, one for the continuous outcome of interest and one for the observed treatment, and then to estimate a linear coefficient for the treatment using the remaining samples through a simple orthogonalized regression. While this methodology is flexible and can accommodate arbitrary predictive models, typically trained independently of one another, this paper argues that a carefully coordinated learning algorithm for deep neural networks may reduce the estimation bias. The improved empirical performance of the proposed method is demonstrated through numerical experiments on both simulated and real data.
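The standard sample-splitting recipe described above can be sketched as follows. This is a minimal illustration, not the coordinated neural network algorithm this paper proposes. It assumes a simulated partially linear model, Y = theta * D + g(X) + noise with D = m(X) + noise, and it swaps the black-box predictors for a small ridge regression on polynomial features so that the example needs only NumPy. The function names (`poly_features`, `fit_ridge`, `dml_estimate`), the nuisance functions, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def poly_features(X):
    # Hypothetical stand-in feature map: intercept, linear, and squared terms.
    return np.hstack([np.ones((X.shape[0], 1)), X, X ** 2])

def fit_ridge(X, y, lam=1e-3):
    # Ridge regression on the expanded features; returns a prediction function.
    # Assumption: this replaces the black-box predictive model in the text.
    Phi = poly_features(X)
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)
    return lambda X_new: poly_features(X_new) @ w

def dml_estimate(X, d, y, rng):
    # Split the samples: one half trains the two predictive models,
    # the other half is used for the final linear regression.
    n = X.shape[0]
    idx = rng.permutation(n)
    train, est = idx[: n // 2], idx[n // 2:]
    predict_y = fit_ridge(X[train], y[train])  # model for the outcome given X
    predict_d = fit_ridge(X[train], d[train])  # model for the treatment given X
    # Residualize outcome and treatment on the held-out half.
    y_res = y[est] - predict_y(X[est])
    d_res = d[est] - predict_d(X[est])
    # Orthogonalized regression: slope of outcome residuals on treatment residuals.
    return float(d_res @ y_res / (d_res @ d_res))

# Simulated data from an assumed partially linear model with true effect theta.
rng = np.random.default_rng(0)
n, p, theta = 4000, 5, 1.5
X = rng.normal(size=(n, p))
d = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(size=n)          # treatment
y = theta * d + X[:, 2] ** 2 - X[:, 0] + rng.normal(size=n)    # outcome

theta_hat = dml_estimate(X, d, y, rng)
print(theta_hat)
```

In this sketch the two predictive models are trained independently of one another, which is the typical setup the abstract contrasts with its coordinated approach. In practice the roles of the two halves are often swapped and the resulting estimates averaged (cross-fitting), which is omitted here for brevity.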