The linear coefficient in a partially linear model with confounding variables can be estimated using double machine learning (DML). However, this DML estimator has a two-stage least squares (TSLS) interpretation and may produce overly wide confidence intervals. To address this issue, we propose a regularization and selection scheme, regsDML, which leads to narrower confidence intervals. It selects either the TSLS DML estimator or a regularization-only estimator, whichever has the smaller estimated variance. The regularization-only estimator is tailored to have a low mean squared error. The regsDML estimator is fully data driven. It converges at the parametric rate, is asymptotically Gaussian, and is asymptotically equivalent to the TSLS DML estimator, yet it exhibits substantially better finite-sample properties. The regsDML estimator builds on the idea of k-class estimators, and we show how DML and k-class estimation can be combined to estimate the linear coefficient in a partially linear endogenous model. Empirical examples illustrate our methodological and theoretical developments. Software for our regsDML method is available in the R package dmlalg.
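
For readers unfamiliar with k-class estimation, the following is a minimal sketch of the textbook k-class estimator in a purely linear instrumental-variables model. It is not the regsDML procedure: it omits the machine-learning adjustment for the nonparametric part, the data-driven choice of the regularization parameter, and the variance-based selection step. The function name `k_class`, the simulated data, and the value `k = 0.8` are illustrative assumptions, not taken from the paper or from dmlalg.

```python
import numpy as np

def k_class(Y, X, A, k):
    """Textbook k-class estimate of beta in Y = X beta + error, instruments A.

    Illustrative helper (hypothetical name), not part of dmlalg.
    """
    n = len(Y)
    # Orthogonal projection onto the column space of the instruments A.
    P = A @ np.linalg.pinv(A)
    # Weight matrix W = I - k (I - P): k = 0 gives OLS, k = 1 gives TSLS.
    W = np.eye(n) - k * (np.eye(n) - P)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)

# Simulated data with a hidden confounder H (assumed setup for illustration).
rng = np.random.default_rng(0)
n = 500
A = rng.normal(size=(n, 2))                      # instruments
H = rng.normal(size=n)                           # hidden confounder
X = (A @ np.array([1.0, -0.5]) + H + rng.normal(size=n)).reshape(-1, 1)
Y = 2.0 * X[:, 0] + H + rng.normal(size=n)       # true coefficient is 2

beta_ols = k_class(Y, X, A, 0.0)   # k = 0: ordinary least squares
beta_tsls = k_class(Y, X, A, 1.0)  # k = 1: two-stage least squares
beta_mid = k_class(Y, X, A, 0.8)   # intermediate k: a regularized compromise
print(beta_ols, beta_mid, beta_tsls)
```

Values of k between 0 and 1 interpolate between OLS and TSLS, which is the sense in which a k-class estimator can be read as a regularized alternative to TSLS; how the regularization is chosen in regsDML is described in the paper itself.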