Learning with feature evolution studies the scenario where the features of a data stream can evolve, i.e., old features vanish and new features emerge. The goal is to keep the model performing well even as the features evolve. To tackle this problem, canonical methods assume that all old features vanish simultaneously and that all new features emerge simultaneously. They also assume that, when the feature space starts to change, there is an overlapping period in which old and new features both exist. In reality, however, feature evolution can be unpredictable: features can vanish or emerge arbitrarily, leaving the overlapping period incomplete. In this paper, we propose a novel paradigm, Prediction with Unpredictable Feature Evolution (PUFE), in which the feature evolution is unpredictable. To address this problem, we fill in the incomplete overlapping period and formulate this task as a new matrix completion problem. We give a theoretical bound on the minimum number of observed entries needed to make the overlapping period intact. With this intact overlapping period, we leverage an ensemble method to take advantage of both the old and new feature spaces without manually deciding which base models should be incorporated. Theoretical and experimental results validate that our method can always follow the best base models and thus realizes the goal of learning with feature evolution.
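The two ingredients named above, filling an incomplete overlapping period by matrix completion and combining base models by an ensemble, can be illustrated with a minimal sketch. This is not the algorithm of the paper, whose details the abstract does not give. It assumes a generic low-rank completion by iterated truncated SVD and a generic exponentially weighted combination of base models. The function names `complete_low_rank` and `exp_weighted_predict` and the parameters `rank`, `n_iters`, and `eta` are hypothetical and chosen only for illustration.

```python
import numpy as np


def complete_low_rank(M, mask, rank, n_iters=200):
    """Fill unobserved entries of M (where mask is False) with a low-rank estimate.

    Assumption: a generic iterated truncated-SVD imputation, used here only
    as a stand-in for the matrix completion step described in the abstract.
    """
    # Start from the observed entries, with zeros in the missing positions.
    X = np.where(mask, M, 0.0)
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        # Best rank-`rank` approximation of the current estimate.
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed entries fixed; update only the missing ones.
        X = np.where(mask, M, low_rank)
    return X


def exp_weighted_predict(preds, cum_losses, eta=1.0):
    """Combine base-model predictions with exponential weights.

    Assumption: a generic exponentially weighted ensemble. Base models with
    smaller cumulative loss receive larger weight, so no model has to be
    selected by hand.
    """
    preds = np.asarray(preds, dtype=float)
    cum_losses = np.asarray(cum_losses, dtype=float)
    # Subtract the minimum loss for numerical stability before exponentiating.
    w = np.exp(-eta * (cum_losses - cum_losses.min()))
    w /= w.sum()
    return float(w @ preds), w
```

In this sketch, rows of `M` would be the instances of the overlapping period and columns the union of old and new features, with `mask` marking which entries were actually observed. The completed matrix could then be used to train base models on both feature spaces, whose predictions `exp_weighted_predict` combines according to their losses so far.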