Estimation and inference on causal parameters are typically reduced to a generalized method of moments problem, which involves auxiliary functions that correspond to solutions of a regression or classification problem. A recent line of work on debiased machine learning shows how one can use generic machine learning estimators for these auxiliary problems while maintaining asymptotic normality and root-$n$ consistency of the target parameter of interest, requiring only mean-squared-error guarantees from the auxiliary estimation algorithms. The literature typically requires that these auxiliary problems be fitted on a separate sample or in a cross-fitting manner. We show that when these auxiliary estimation algorithms satisfy natural leave-one-out stability properties, sample splitting is not required. This allows for sample re-use, which can be beneficial in moderately sized sample regimes. For instance, we show that the stability properties we propose are satisfied by ensemble bagged estimators built via sub-sampling without replacement, a popular technique in machine learning practice.
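As a concrete illustration (our example; the abstract does not fix a particular moment), the canonical instance of this setup is the doubly robust moment for the average treatment effect $\theta_0$, where the auxiliary functions are the outcome regression $g_0(t,x)=\mathbb{E}[Y \mid T=t, X=x]$ and the propensity score $p_0(x)=\Pr[T=1 \mid X=x]$:
\[
\mathbb{E}\!\left[\, g_0(1,X) - g_0(0,X) + \frac{T\,\bigl(Y - g_0(1,X)\bigr)}{p_0(X)} - \frac{(1-T)\,\bigl(Y - g_0(0,X)\bigr)}{1 - p_0(X)} - \theta_0 \,\right] = 0.
\]
Debiased machine learning solves the empirical analogue of this moment after plugging in machine-learned estimates $\hat g$ and $\hat p$; the orthogonality of the moment makes the resulting $\hat\theta$ first-order insensitive to errors in those plug-ins.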
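The sketch below (ours; the function names and the simple subsample-bagging loop are illustrative assumptions, not the paper's code) shows the kind of ensemble the abstract refers to: a bagged regression estimator built by averaging base learners, each fit on a random subsample drawn without replacement, used to fit a nuisance function on the full sample with no cross-fitting.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_subsample_bagging(X, y, n_estimators=100, subsample_frac=0.5, seed=0):
    """Fit an ensemble of trees, each on a subsample drawn WITHOUT replacement.

    Averaging many such subsampled fits is the bagging scheme for which the
    abstract asserts a leave-one-out stability property, permitting nuisance
    estimation on the full sample (no sample splitting or cross-fitting).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    m = max(1, int(subsample_frac * n))
    models = []
    for _ in range(n_estimators):
        idx = rng.choice(n, size=m, replace=False)  # subsample without replacement
        models.append(DecisionTreeRegressor().fit(X[idx], y[idx]))
    return models

def predict_bagging(models, X):
    """Ensemble prediction: average over base-learner predictions."""
    return np.mean([m.predict(X) for m in models], axis=0)

# Minimal usage: fit the nuisance regression on the full sample and evaluate
# it on that same sample, the re-use the abstract's stability result allows.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 3))
    y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=500)
    ensemble = fit_subsample_bagging(X, y)
    g_hat = predict_bagging(ensemble, X)  # in-sample nuisance predictions
```

Intuitively, because each base learner sees any given observation only with probability roughly `subsample_frac`, the averaged prediction changes little when one observation is removed, which is the leave-one-out stability the abstract invokes.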