We present Natural Gradient Boosting (NGBoost), an algorithm that brings probabilistic prediction capability to gradient boosting in a generic way. Predictive uncertainty estimation is crucial in many applications, such as healthcare and weather forecasting. Probabilistic prediction, in which the model outputs a full probability distribution over the entire outcome space, is a natural way to quantify those uncertainties. Gradient boosting machines have been widely successful in prediction tasks on structured input data, but a simple boosting solution for probabilistic prediction of real-valued outputs is yet to be made. NGBoost is a gradient boosting approach that uses the \emph{Natural Gradient} to address technical challenges that make generic probabilistic prediction hard with existing gradient boosting methods. Our approach is modular with respect to the choice of base learner, probability distribution, and scoring rule. We show empirically on several regression datasets that NGBoost provides competitive predictive performance on both uncertainty estimates and traditional metrics.
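To make the role of the natural gradient concrete, the following is a minimal sketch (not the paper's implementation) of the quantity NGBoost fits its base learners to: for a Normal distribution $N(\mu, \sigma^2)$ parameterized as $\theta = (\mu, \log\sigma)$ and scored by the negative log-likelihood, the natural gradient is the ordinary gradient premultiplied by the inverse Fisher information matrix. All function names here are illustrative.

```python
import numpy as np

def grad_nll(mu, log_sigma, y):
    """Ordinary gradient of -log N(y | mu, sigma^2) w.r.t. (mu, log sigma)."""
    sigma2 = np.exp(2.0 * log_sigma)
    d_mu = (mu - y) / sigma2
    d_log_sigma = 1.0 - (y - mu) ** 2 / sigma2
    return np.array([d_mu, d_log_sigma])

def fisher(mu, log_sigma):
    """Fisher information of the Normal in the (mu, log sigma) parameterization.

    It is diagonal here: E[d^2 NLL / d mu^2] = 1/sigma^2 and
    E[d^2 NLL / d (log sigma)^2] = 2, with zero cross terms.
    """
    sigma2 = np.exp(2.0 * log_sigma)
    return np.diag([1.0 / sigma2, 2.0])

def natural_grad(mu, log_sigma, y):
    """Natural gradient: inverse Fisher information times the ordinary gradient."""
    return np.linalg.solve(fisher(mu, log_sigma), grad_nll(mu, log_sigma, y))
```

For example, at $\mu = 0$, $\log\sigma = 0$, $y = 1$ the ordinary gradient is $(-1, 0)$ and the Fisher matrix is $\mathrm{diag}(1, 2)$, so the natural gradient is also $(-1, 0)$; unlike the ordinary gradient, the natural gradient is invariant to how the distribution is parameterized, which is the property the paper exploits.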