In a data stream environment, classification models must handle concept drift efficiently and effectively. Ensemble methods are widely used for this purpose; however, existing methods either use a large data chunk to update the model or learn the data one by one. In the former, the model may miss changes in the data distribution; in the latter, the model may suffer from inefficiency and instability. To address these issues, we introduce a novel ensemble approach based on the Broad Learning System (BLS), in which mini chunks are used at each update. BLS is an effective lightweight neural architecture recently developed for incremental learning. Although it is fast, it requires large data chunks for effective updates and is unable to handle the dynamic changes observed in data streams. Our proposed approach, named Broad Ensemble Learning System (BELS), uses a novel updating method that significantly improves model accuracy. It employs an ensemble of output layers to address the limitations of BLS and to handle drifts. Our model tracks changes in the accuracy of the ensemble components and reacts to these changes. We present the mathematical derivation of BELS, perform comprehensive experiments with 20 datasets that demonstrate the adaptability of our model to various drift types, and provide a hyperparameter and ablation analysis of our proposed model. Our experiments show that the proposed approach outperforms nine state-of-the-art baselines and yields an overall improvement of 13.28% in terms of average prequential accuracy.