Hard-ODT: Hardware-Friendly Online Decision Tree Learning Algorithm and System

Decision trees are machine learning models commonly used in various application scenarios. In the era of big data, traditional decision tree induction algorithms are not suitable for learning from large-scale datasets due to their stringent data storage requirements. Online decision tree learning algorithms have been devised to tackle this problem by concurrently training with incoming samples and providing inference results. However, even the most up-to-date online tree learning algorithms still suffer from either high memory usage or high computational intensity with data dependencies and long latency, making them challenging to implement in hardware. To overcome these difficulties, we introduce a new quantile-based algorithm to improve the induction of the Hoeffding tree, one of the state-of-the-art online learning models. The proposed algorithm is lightweight in terms of both memory and computational demand, while still maintaining high generalization ability. A series of optimization techniques dedicated to the proposed algorithm have been investigated from the hardware perspective, including coarse-grained and fine-grained parallelism, dynamic and memory-based resource sharing, and pipelining with data forwarding. Following this, we present Hard-ODT, a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array (FPGA) with system-level optimization techniques. Performance and resource utilization are modeled for the complete learning system for early and fast analysis of the trade-off between various design metrics. Finally, we propose a design flow in which the proposed learning system is applied to FPGA run-time power monitoring as a case study.
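For readers unfamiliar with the Hoeffding tree mentioned above, the following is a minimal sketch of the standard Hoeffding-bound split test that such trees are generally built on. It does not reproduce the quantile-based algorithm proposed here, which the abstract does not specify. The function names, the `tie_threshold` parameter, and the numeric values are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return the Hoeffding bound epsilon.

    With probability 1 - delta, the true mean of a random variable with
    the given range lies within epsilon of the mean observed over n samples.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, value_range: float,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """Decide whether a leaf has seen enough samples to split.

    Split when the best attribute beats the runner-up by more than epsilon,
    or when epsilon has shrunk below tie_threshold (a tie-breaking rule;
    the default value here is an assumption, not taken from the abstract).
    """
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # Hypothetical gains measured at a leaf after 200 samples.
    print(should_split(0.30, 0.05, value_range=1.0, delta=1e-7, n=200))
    print(should_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=200))
```

Because the bound shrinks as the sample count grows, a leaf whose two best attributes are close will keep accumulating statistics until either the gap exceeds the bound or the tie-breaking rule fires. The per-leaf statistics needed to evaluate the candidate gains are what drive the memory and compute cost that the abstract refers to.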
