Deep neural networks learn multi-layered representations via back-propagation (BP). Although the gradient boosting decision tree (GBDT) is effective for modeling tabular data, it is non-differentiable with respect to its input and therefore cannot learn multi-layered representations. In this paper, we propose a framework for learning multi-layered GBDT via BP. We approximate the gradient of a GBDT using linear regression: specifically, we replace the constant value at each leaf with a linear regression, ignoring the contribution of individual samples to the tree structure. In this way, we estimate gradients for intermediate representations, which enables BP through multi-layered GBDT. Experiments demonstrate the effectiveness of the proposed method in terms of both performance and representation ability. To the best of our knowledge, this is the first work to optimize multi-layered GBDT via BP. It opens new possibilities for deep tree-based learning and for combining GBDT with neural networks.
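The core idea can be illustrated with a minimal sketch. This is our own hypothetical illustration, not the authors' code: we fit a single regression tree, then replace each leaf's constant prediction with a local linear model, so the tree's output has a usable slope with respect to its input (the tree structure itself is treated as fixed, mirroring the paper's assumption of ignoring individual samples' contribution to the structure). All names here (`leaf_models`, `grad_wrt_input`) are ours.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Fit an ordinary regression tree (constant value per leaf).
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
leaf_ids = tree.apply(X)  # which leaf each training sample falls into

# Replace each leaf's constant with a local linear model y ≈ w·x + b,
# fitted only on the samples routed to that leaf.
leaf_models = {}
for leaf in np.unique(leaf_ids):
    mask = leaf_ids == leaf
    A = np.hstack([X[mask], np.ones((mask.sum(), 1))])  # append bias column
    coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
    leaf_models[leaf] = coef  # coef[:-1] = w, coef[-1] = b

def grad_wrt_input(x):
    """Approximate d(tree output)/dx at x: the slope w of the leaf's
    linear model, with the tree structure held fixed."""
    leaf = tree.apply(x.reshape(1, -1))[0]
    return leaf_models[leaf][:-1]

g = grad_wrt_input(X[0])  # a 3-dimensional gradient estimate
```

In a multi-layered GBDT, gradients like `g` would be chained layer by layer to propagate errors back through intermediate representations, in place of the exact (undefined) tree gradient.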