Gradient Boosting (GB) is a popular methodology used to solve prediction problems by minimizing a differentiable loss function, $L$. GB performs very well on tabular machine learning (ML) problems; however, as a pure ML solver it lacks the ability to fit models with probabilistic but correlated multi-dimensional outputs, for example, multiple correlated Bernoulli outputs. GB also does not form intermediate abstract data embeddings, a property of Deep Learning that gives greater flexibility and performance on other types of problems. This paper presents a simple adjustment to GB motivated in part by artificial neural networks. Specifically, our adjustment inserts a matrix multiplication between the output of a GB model and the loss, $L$. This allows the output of a GB model to have increased dimension prior to being fed into the loss, making it ``wider'' than standard GB implementations. We call our method Wide Boosting (WB) and show that WB outperforms GB on multi-dimensional output tasks and that the embeddings generated by WB are more useful in downstream prediction tasks than GB output predictions alone.
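
As an illustration of the matrix multiplication placed between the GB output and the loss, the following is a minimal NumPy sketch, not the authors' implementation. The names `F` (wide GB output), `B` (the inserted matrix), the width `w`, and the choice of a squared-error loss are assumptions made for this example; the abstract itself fixes only that the loss is differentiable and that a matrix multiplication sits between the model output and $L$. The sketch shows the forward map and the chain-rule gradients that a boosting step and a matrix update could each use.

```python
import numpy as np

def wide_predict(F, B):
    """Map wide outputs F (n, w) to loss inputs (n, k) via matrix B (w, k)."""
    return F @ B

def squared_error(P, Y):
    """Illustrative differentiable loss (an assumption): mean over rows of 0.5 * ||p - y||^2."""
    return 0.5 * np.mean(np.sum((P - Y) ** 2, axis=1))

def gradients(F, B, Y):
    """Chain-rule gradients of the loss with respect to F and B."""
    n = F.shape[0]
    R = (wide_predict(F, B) - Y) / n  # dL/dP, shape (n, k)
    grad_F = R @ B.T                  # shape (n, w): what new trees would be fit against
    grad_B = F.T @ R                  # shape (w, k): gradient for the inserted matrix
    return grad_F, grad_B

# Hypothetical sizes: n samples, wide dimension w, output dimension k.
rng = np.random.default_rng(0)
n, w, k = 8, 5, 2
F = rng.normal(size=(n, w))  # stand-in for the wide output of a GB model
B = rng.normal(size=(w, k))
Y = rng.normal(size=(n, k))

grad_F, grad_B = gradients(F, B, Y)
```

Here `w` may exceed `k`, so the rows of `F` can act as an intermediate embedding while `F @ B` has the dimension the loss expects.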