Concept drift refers to the phenomenon that the distribution underlying the observed data changes over time. We are interested in identifying the features that are most relevant to the observed drift. We distinguish between drift-inducing features, whose drift cannot be explained by any other feature, and faithfully drifting features, whose drift correlates with the drift of other features. This distinction gives rise to minimal subsets of the feature space that characterize the observed drift as a whole. We relate this problem to feature selection and feature relevance learning, which allows us to derive a detection algorithm. We demonstrate its usefulness on several benchmarks.
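To make the distinction between drift-inducing and faithfully drifting features concrete, the following is a minimal sketch, not the detection algorithm derived in this work. It assumes that drift can be read as statistical dependence between a feature and time, and it swaps in plain linear (partial) correlation as the relevance measure. The function names, the threshold of 0.1, and the toy data are all hypothetical choices made for illustration. A feature is called drifting if it correlates with time; a drifting feature is called faithfully drifting if controlling for a single other feature removes that correlation, and drift inducing otherwise.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def classify_features(features, time, threshold=0.1):
    """Label each feature by how its dependence on time behaves.

    Assumption: linear correlation with time stands in for drift, and
    conditioning is done on one other feature at a time.
    """
    labels = {}
    for name, values in features.items():
        if abs(pearson(values, time)) < threshold:
            labels[name] = "non-drifting"
            continue
        # Does some other single feature explain away the dependence on time?
        explained = any(
            abs(partial_corr(values, time, other)) < threshold
            for other_name, other in features.items()
            if other_name != name
        )
        labels[name] = "faithfully drifting" if explained else "drift inducing"
    return labels


# Hypothetical toy stream: "source" drifts with time, "follower" is a noisy
# copy of "source", and "noise" is stationary.
random.seed(0)
n = 5000
time = [i / n for i in range(n)]
source = [2 * t + random.gauss(0, 0.5) for t in time]
follower = [s + random.gauss(0, 0.5) for s in source]
noise = [random.gauss(0, 1) for _ in range(n)]

labels = classify_features(
    {"source": source, "follower": follower, "noise": noise}, time
)
print(labels)
```

In this toy setting the set containing only `source` plays the role of a minimal subset that characterizes the drift: `follower` drifts only through its dependence on `source`. Linear partial correlation misses nonlinear drift and drift that only several features explain jointly, so a practical method would need a more general relevance measure.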