We estimate the linear coefficient in a partially linear model with confounding variables. We rely on double machine learning (DML) and extend it with an additional regularization and selection scheme. We allow for more general dependence structures among the model variables than what has been investigated previously, and we prove that this DML estimator remains asymptotically Gaussian and converges at the parametric rate. The DML estimator has a two-stage least squares interpretation and may produce overly wide confidence intervals. To address this issue, we propose the regularization-selection regsDML method that leads to narrower confidence intervals. It is fully data driven and optimizes an estimated asymptotic mean squared error of the coefficient estimate. Empirical examples demonstrate our methodological and theoretical developments. Software code for our regsDML method will be made available in the R-package dmlalg.
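To make the two-stage least squares interpretation concrete, the following is a minimal sketch of a plain DML-type estimate on simulated data. It is not the dmlalg interface and does not implement the regsDML regularization or selection step. Several ingredients are assumptions made purely for illustration: an instrument-like variable `a`, a hidden confounder `h`, observed covariates `w` that enter nonparametrically, the simulated data-generating process, and the use of a polynomial least-squares fit as a stand-in for an arbitrary machine learning regressor. All function names are hypothetical.

```python
import numpy as np


def poly_features(w, degree=5):
    """Polynomial basis 1, w, ..., w^degree (stand-in for any ML learner)."""
    return np.vander(w, degree + 1, increasing=True)


def cross_fit_residuals(target, w, folds, degree=5):
    """Residuals target - E_hat[target | w], with each fold predicted out-of-fold."""
    resid = np.empty(len(w))
    for test_idx in folds:
        train = np.ones(len(w), dtype=bool)
        train[test_idx] = False
        coef, *_ = np.linalg.lstsq(
            poly_features(w[train], degree), target[train], rcond=None
        )
        resid[test_idx] = target[test_idx] - poly_features(w[test_idx], degree) @ coef
    return resid


def tsls(r_y, r_x, r_a):
    """Two-stage least squares of r_y on r_x, using r_a as instrument."""
    r_x = r_x.reshape(len(r_x), -1)
    r_a = r_a.reshape(len(r_a), -1)
    # Stage 1: project the residualized regressor on the residualized instrument.
    first_stage, *_ = np.linalg.lstsq(r_a, r_x, rcond=None)
    x_hat = r_a @ first_stage
    # Stage 2: regress the residualized response on the projected regressor.
    beta, *_ = np.linalg.lstsq(x_hat, r_y, rcond=None)
    return beta


def simulate(n=4000, beta=0.5, seed=0):
    """Assumed toy model: h confounds x and y, a shifts x only, w acts nonlinearly."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-2.0, 2.0, n)
    h = rng.normal(size=n)  # hidden confounder, never passed to the estimator
    a = np.sin(w) + rng.normal(size=n)
    x = a + np.cos(w) + h + rng.normal(size=n)
    y = beta * x + w**2 + h + rng.normal(size=n)
    return y, x, a, w


def dml_estimate(y, x, a, w, n_folds=2, seed=0):
    """Residualize y, x and a on w with shared folds, then apply 2SLS."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(w)), n_folds)
    r_y, r_x, r_a = (cross_fit_residuals(v, w, folds) for v in (y, x, a))
    return tsls(r_y, r_x, r_a)


y, x, a, w = simulate()
print("estimate of beta:", dml_estimate(y, x, a, w))
```

In this toy setup the hidden confounder biases a direct regression of the residualized response on the residualized regressor, whereas the instrumented second stage targets the linear coefficient. The price is the additional variance that motivates a regularized alternative.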