We define infinitesimal gradient boosting as a limit of the popular tree-based gradient boosting algorithm from machine learning. The limit is taken in the vanishing-learning-rate asymptotic, that is, when the learning rate tends to zero and the number of gradient trees is rescaled accordingly. For this purpose, we introduce a new class of randomized regression trees that bridges totally randomized trees and Extra Trees and uses a softmax distribution for binary splitting. Our main result is the convergence of the associated stochastic algorithm and the characterization of the limiting procedure as the unique solution of a nonlinear ordinary differential equation in an infinite-dimensional function space. Infinitesimal gradient boosting defines a smooth path in the space of continuous functions along which the training error decreases, the residuals remain centered, and the total variation is well controlled.
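As a rough illustration of the ingredients named above, the following Python sketch boosts depth-one trees whose split is drawn from a softmax distribution over a few random candidate thresholds, with the number of trees rescaled as `total_time / learning_rate`. It is a minimal sketch under stated assumptions, not the construction analysed in the paper: the squared-error loss, the one-dimensional input, the split score, and all names (`softmax_probs`, `sample_stump`, `boost`, `beta`, `n_candidates`, `total_time`) are hypothetical choices made for illustration.

```python
import numpy as np


def softmax_probs(scores, beta):
    """Softmax weights over candidate split scores.

    beta = 0 gives a uniform choice (totally randomized flavour);
    large beta concentrates on the best-scoring candidate (Extra-Trees flavour).
    """
    z = beta * np.asarray(scores, dtype=float)
    z -= z.max()  # shift for numerical stability; does not change the weights
    w = np.exp(z)
    return w / w.sum()


def sample_stump(x, residuals, rng, beta=1.0, n_candidates=5):
    """Draw random thresholds, score them, and pick one via softmax (assumed scheme)."""
    thresholds = rng.uniform(x.min(), x.max(), size=n_candidates)
    scores = np.empty(n_candidates)
    for k, t in enumerate(thresholds):
        left = x <= t
        n_l, n_r = left.sum(), (~left).sum()
        # Score: reduction in mean squared residual when each leaf predicts its mean.
        s_l = residuals[left].sum() ** 2 / n_l if n_l else 0.0
        s_r = residuals[~left].sum() ** 2 / n_r if n_r else 0.0
        scores[k] = (s_l + s_r) / len(x)
    t = thresholds[rng.choice(n_candidates, p=softmax_probs(scores, beta))]
    left = x <= t
    left_val = residuals[left].mean() if left.any() else 0.0
    right_val = residuals[~left].mean() if (~left).any() else 0.0
    return t, left_val, right_val


def boost(x, y, learning_rate, total_time, rng, beta=1.0):
    """Gradient boosting for squared error with n_trees = total_time / learning_rate."""
    n_trees = int(round(total_time / learning_rate))
    pred = np.full_like(y, y.mean(), dtype=float)  # start from the constant fit
    for _ in range(n_trees):
        residuals = y - pred  # negative gradient of the squared-error loss
        t, left_val, right_val = sample_stump(x, residuals, rng, beta)
        pred += learning_rate * np.where(x <= t, left_val, right_val)
    return pred
```

Calling `boost` with a fixed `total_time` and smaller and smaller `learning_rate` values mimics the vanishing-learning-rate regime. In this toy setting the leaf values are residual means, so each step leaves the residual sum at zero and cannot increase the training error, which mirrors the centering and monotonicity properties stated in the abstract for this special case only.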