Practical machine learning applications involving time series data, such as firewall log analysis for proactive detection of anomalous behavior, require real-time analysis of streaming data. Because the statistical characteristics of such data may shift frequently over time, the ML models must be updated accordingly. One approach explored in the literature is to retrain models on updated data whenever their accuracy is observed to degrade. However, such methods rely on near-real-time availability of ground truth, a requirement that is rarely met. Further, in applications with seasonal data, temporal concept drift is confounded by seasonal variation. In this work, we propose an approach called Unsupervised Temporal Drift Detector (UTDD) that flexibly accounts for seasonal variation, efficiently detects temporal concept drift in time series data in the absence of ground truth, and subsequently adapts our ML models to concept drift for better generalization.