Machine learning regression methods allow estimation of functions without unrealistic parametric assumptions. Although they can perform exceptionally well in terms of prediction error, most lack the theoretical convergence rates necessary for semi-parametric efficient estimation (e.g., TMLE, AIPW) of parameters like average treatment effects. The Highly Adaptive Lasso (HAL) is the only regression method proven to converge quickly enough for a meaningfully large class of functions, independent of the dimensionality of the predictors. Unfortunately, HAL is not computationally scalable. In this paper we build upon the theory of HAL to construct the Selectively Adaptive Lasso (SAL), a new algorithm that retains HAL's dimension-free, nonparametric rate of convergence but also scales computationally to massive datasets. To accomplish this, we prove general theoretical results pertaining to empirical loss minimization in nested Donsker classes. Our resulting algorithm is a form of gradient tree boosting with an adaptive learning rate, which makes it fast and trivial to implement with off-the-shelf software. Finally, we show that our algorithm retains the performance of standard gradient boosting on a diverse group of real-world datasets. SAL makes semi-parametric efficient estimators practically possible and theoretically justifiable in many big-data settings.
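To make the claim that the method is "fast and trivial to implement with off-the-shelf software" concrete, the sketch below illustrates generic gradient tree boosting with a per-round learning rate chosen by line search. It is an illustration under assumptions only (simulated data, squared-error loss, shallow sklearn trees, a fixed step-size grid), not the authors' SAL implementation.

```python
# Minimal sketch: gradient tree boosting for squared-error loss with an
# adaptive (line-searched) learning rate. Hypothetical illustration only;
# data, tree depth, number of rounds, and the step-size grid are assumptions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=1000)

pred = np.full_like(y, y.mean())  # initialize with the marginal mean
trees, rates = [], []

for _ in range(100):
    residual = y - pred                      # negative gradient of squared-error loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    update = tree.predict(X)
    # adaptive learning rate: choose the step size minimizing this round's training loss
    grid = np.linspace(0.01, 1.0, 50)
    losses = [np.mean((y - (pred + lr * update)) ** 2) for lr in grid]
    lr = grid[int(np.argmin(losses))]
    pred += lr * update
    trees.append(tree)
    rates.append(lr)
```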