In this study, we present Adaptive Decision Forest (ADF), an incremental machine learning framework that builds a decision forest to classify new records. Based on two novel theorems, we introduce a new splitting strategy called iSAT, which allows ADF to classify new records even if they are associated with previously unseen classes. ADF can identify and handle concept drift without forgetting previously gained knowledge. Moreover, ADF can handle big data, provided the data can be divided into batches. We evaluate ADF on five publicly available natural data sets and one synthetic data set, and compare its performance against that of eight state-of-the-art techniques. Our experimental results, including statistical sign test and Nemenyi test analyses, indicate that the proposed framework clearly outperforms these techniques.
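For readers unfamiliar with the sign test mentioned above, the following is a minimal sketch of an exact two-sided sign test over per-data-set win/loss counts. It is not the authors' evaluation code; the function name `sign_test_p_value` and the example counts are hypothetical and chosen only for illustration.

```python
from math import comb


def sign_test_p_value(wins: int, losses: int) -> float:
    """Exact two-sided sign test p-value.

    Assumption: ties between the two compared methods have already
    been dropped, so every data set counts as a win or a loss.
    """
    n = wins + losses
    if n == 0:
        return 1.0  # no informative comparisons
    k = min(wins, losses)
    # Probability of k or fewer successes under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail and cap at 1
    return min(1.0, 2 * tail)


# Hypothetical example: one method wins on all six data sets
p = sign_test_p_value(wins=6, losses=0)
print(p)
```

With six data sets, a method must win on all six for this two-sided test to fall below the conventional 0.05 threshold, which shows how little room a small benchmark suite leaves for this kind of test.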