Recent studies in Learning to Rank have shown that it is possible to effectively distill a neural network from an ensemble of regression trees. This result makes neural networks natural competitors of tree-based ensembles on the ranking task. Nevertheless, ensembles of regression trees outperform neural models in terms of both efficiency and effectiveness, particularly when scoring on CPU. In this paper, we propose an approach for speeding up neural scoring time by applying a combination of distillation, pruning, and fast matrix multiplication. We employ knowledge distillation to learn shallow neural networks from an ensemble of regression trees. Then, we exploit an efficiency-oriented pruning technique that sparsifies the most computationally intensive layers of the neural network, which are then scored with optimized sparse matrix multiplication. Moreover, by studying both dense and sparse high-performance matrix multiplication, we develop a scoring-time prediction model that helps in devising neural network architectures matching the desired efficiency requirements. Comprehensive experiments on two public learning-to-rank datasets show that neural networks produced with our novel approach are competitive at any point of the effectiveness-efficiency trade-off when compared with tree-based ensembles, providing up to a 4x scoring-time speed-up without affecting ranking quality.
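To make the pipeline concrete, below is a minimal sketch of the three stages the abstract describes: distilling a tree-ensemble ranker into a shallow network, sparsifying its heaviest layer, and scoring it with sparse matrix multiplication on CPU. This is an illustrative assumption using PyTorch and SciPy, not the paper's actual implementation; the feature count, hidden size, pruning rate, and the `teacher_scores` placeholder are all hypothetical.

```python
# Hypothetical sketch of distillation -> pruning -> sparse scoring.
# All sizes and data below are illustrative; they are not from the paper.
import numpy as np
import torch
import torch.nn as nn
import scipy.sparse as sp

FEATURES, HIDDEN = 136, 512  # assumed feature count and hidden width

# Placeholder data: query-document features and scores produced by the
# tree-based ensemble (the "teacher"); in practice these come from the dataset
# and from scoring the ensemble.
X = np.random.rand(1000, FEATURES).astype(np.float32)
teacher_scores = np.random.rand(1000).astype(np.float32)

# 1) Knowledge distillation: regress a shallow student network onto the
#    teacher's scores with a mean squared error objective.
student = nn.Sequential(nn.Linear(FEATURES, HIDDEN), nn.ReLU(), nn.Linear(HIDDEN, 1))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
xb = torch.from_numpy(X)
yb = torch.from_numpy(teacher_scores).unsqueeze(1)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(student(xb), yb)
    loss.backward()
    opt.step()

# 2) Efficiency-oriented pruning (simplified here as magnitude pruning):
#    zero out the smallest weights of the most expensive layer, e.g. 90% sparsity.
with torch.no_grad():
    w = student[0].weight
    threshold = w.abs().flatten().kthvalue(int(0.9 * w.numel())).values
    w.mul_((w.abs() > threshold).float())

# 3) CPU scoring with sparse matrix multiplication: store the pruned layer in
#    CSR format and use a sparse-dense product for the hidden layer.
w1_csr = sp.csr_matrix(student[0].weight.detach().numpy())
b1 = student[0].bias.detach().numpy()
w2 = student[2].weight.detach().numpy()
b2 = student[2].bias.detach().numpy()

hidden = np.maximum((w1_csr @ X.T).T + b1, 0.0)  # sparse matmul + ReLU
scores = hidden @ w2.T + b2                      # dense output layer
```

In such a setup, the scoring-time prediction model mentioned in the abstract would estimate the cost of each layer from its size and sparsity, so that the architecture (widths and pruning rates) can be chosen to meet a target latency before training.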