State-of-the-art machine learning solutions mainly focus on creating highly accurate models without constraints on hardware resources. Stream mining algorithms, by contrast, are designed to run on resource-constrained devices, so low power consumption and energy and memory efficiency are essential. The Hoeffding tree algorithm is able to create energy-efficient models, but at the cost of less accurate trees in comparison to their ensemble counterparts. Ensembles of Hoeffding trees, on the other hand, create a highly accurate forest of trees but consume five times more energy on average. The Extremely Fast Decision Tree (EFDT) is an extension of the Hoeffding tree that tried to obtain results similar to those of ensembles of Hoeffding trees. This paper presents the Green Accelerated Hoeffding Tree (GAHT) algorithm, an extension of the EFDT algorithm with a lower energy and memory footprint and the same (or, for some datasets, higher) accuracy levels. GAHT grows the tree by setting individual splitting criteria for each node, based on the distribution of the number of instances over each particular leaf. The results show that GAHT achieves accuracy competitive with EFDT and ensembles of Hoeffding trees while reducing energy consumption by up to 70%.
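To make the idea of per-node splitting criteria concrete, the following is a minimal sketch and not the GAHT implementation. The function `hoeffding_bound` uses the standard Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)). Everything else is an assumption made for illustration: the function `should_split`, the `busy_share` threshold, the default parameter values, and in particular the rule that leaves holding a large share of the instances use an EFDT-style test (best attribute against not splitting) while the remaining leaves use the stricter Hoeffding tree test (best against second-best attribute). The abstract does not state which leaves receive which criterion.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound: sqrt(R^2 * ln(1/delta) / (2 * n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_best_gain, leaf_count, total_count,
                 delta=1e-7, value_range=1.0, busy_share=0.1):
    """Hypothetical per-leaf split decision (illustrative only).

    leaf_count  -- instances observed at this leaf
    total_count -- instances observed by the whole tree
    busy_share  -- assumed threshold on leaf_count / total_count
    """
    if leaf_count == 0 or total_count == 0:
        return False

    epsilon = hoeffding_bound(value_range, delta, leaf_count)
    share = leaf_count / total_count

    if share >= busy_share:
        # Assumed EFDT-style test: the best attribute only has to beat
        # the "no split" option, whose gain is taken to be 0.
        return best_gain > epsilon

    # Assumed Hoeffding-tree-style test: the best attribute has to beat
    # the second-best attribute by more than epsilon.
    return best_gain - second_best_gain > epsilon
```

Under these assumptions, a leaf that has seen half of all instances would split as soon as its best gain exceeds the bound, whereas a leaf that has seen 1% of them would wait until the best attribute clearly separates from the runner-up.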