Conventional decision trees have a number of favorable properties, including interpretability, a small computational footprint, and the ability to learn from little training data. However, they lack a key quality that has helped fuel the deep learning revolution: that of being end-to-end trainable and of learning, from scratch, the features that best allow solving a given supervised learning problem. Recent work (Kontschieder et al., 2015) has addressed this deficit, but at the cost of losing a main attractive trait of decision trees: the fact that each sample is routed along only a small subset of tree nodes. We here propose a model and an Expectation-Maximization training scheme for decision trees that are fully probabilistic at train time, but become deterministic at test time after a deterministic annealing process. We also analyze the learned oblique split parameters on image datasets and show that neural networks can be trained at each split node. In summary, we present the first end-to-end learning scheme for deterministic decision trees and report results on par with or superior to those of published standard oblique decision tree algorithms.
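To make the routing mechanism concrete, the following is a minimal Python sketch of an oblique split node with soft, sigmoid-based routing that hardens into a deterministic decision as an annealing parameter grows. The names `w`, `b`, and `beta`, and the use of a sigmoid steepness parameter as the annealing knob, are illustrative assumptions rather than the authors' exact formulation.

```python
import numpy as np

def oblique_split_prob(x, w, b, beta=1.0):
    """Probability of routing sample x to the left child.

    An oblique split thresholds a linear combination of all input
    features, w . x + b, rather than a single feature. At train time
    the routing is soft (a sigmoid), so every sample reaches every
    leaf with some probability. Increasing the inverse temperature
    `beta` during annealing sharpens the sigmoid toward a step
    function (assumed annealing mechanism).
    """
    return 1.0 / (1.0 + np.exp(-beta * (np.dot(w, x) + b)))

def oblique_split_hard(x, w, b):
    """Deterministic test-time routing: the beta -> infinity limit,
    under which each sample follows a single root-to-leaf path."""
    return np.dot(w, x) + b > 0
```

In this reading, train-time probabilistic routing keeps the objective differentiable (enabling end-to-end learning of the split parameters), while the annealed, hard test-time routing restores the small per-sample computational footprint of conventional trees.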