Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A commonly criticised point, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy and size. In recent years, this motivated the development of optimal classification tree algorithms that globally optimise the decision tree in contrast to heuristic methods that perform a sequence of locally optimal decisions. We follow this line of work and provide a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our algorithm supports constraints on the depth of the tree and number of nodes. The success of our approach is attributed to a series of specialised techniques that exploit properties unique to classification trees. Whereas algorithms for optimal classification trees have traditionally been plagued by high runtimes and limited scalability, we show in a detailed experimental study that our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances, providing several orders of magnitude improvements and notably contributing towards the practical realisation of optimal decision trees.
翻译:在机器学习中,决策树的学习是一种广泛使用的方法,在需要简明和可解释模型的应用中偏好这种方法;传统上,使用休养方法来快速制作精度较高的模型;然而,一个普遍批评的点是,由此产生的树木不一定是数据精确度和大小方面的最佳体现;近年来,这促使发展最佳的分类树算法,使决策树在全球最优化,而不是执行当地最佳决策顺序的休养方法;我们遵循这一工作路线,为学习基于动态编程和搜索的最佳分类树提供新的算法;我们的算法支持对树木深度和节点数目的限制;我们的方法之所以成功,是因为一系列专门技术利用了树分类所特有的特性;而最佳分类树的算法历来受到高运行时间和伸缩性的困扰,我们在一项详细的实验研究中显示,我们的方法只使用最先进技术所需要的时间的一小部分时间,可以处理数以万计实例的数据集,提供几级级改进,特别是有助于实际实现最佳决策树。