Feature evolvable learning has been widely studied in recent years: when learning with streams, old features vanish and new features emerge over time. Conventional methods usually assume that a label is revealed after each prediction. In practice, however, this assumption may not hold, and no label is given at most time steps. A good solution is to leverage the technique of manifold regularization, using previously seen similar data to assist the refinement of the online model. Nevertheless, this approach requires storing all previous data, which is impossible when learning with streams that arrive sequentially in large volume; we therefore need a buffer to store only part of them. Since different devices may have different storage budgets, the learning approach should be flexible with respect to the storage budget limit. In this paper, we propose a new setting: Storage-Fit Feature-Evolvable streaming Learning (SF$^2$EL), which incorporates the issue of rarely provided labels into feature evolution. Our framework adapts its behavior to different storage budgets when learning with feature-evolvable streams containing unlabeled data. Moreover, both theoretical and empirical results validate that our approach preserves the merit of the original feature evolvable learning, i.e., it can always track the best baseline and thus performs well at any time step.
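The abstract does not fix a concrete buffering rule or similarity measure. Below is a minimal sketch of the two ingredients it mentions, assuming reservoir sampling for the budget-limited buffer and a Gaussian similarity for the manifold-regularization term; the names `ReservoirBuffer`, `manifold_step`, and the parameters `lam` and `sigma` are illustrative choices, not taken from the paper.

```python
import numpy as np

class ReservoirBuffer:
    """Fixed-budget buffer (assumed reservoir sampling): memory never
    exceeds `budget`, and every stream element is equally likely kept."""
    def __init__(self, budget, rng=None):
        self.budget = budget
        self.items = []
        self.seen = 0
        self.rng = rng or np.random.default_rng()

    def add(self, x):
        self.seen += 1
        if len(self.items) < self.budget:
            self.items.append(x)
        else:
            # Replace a stored item with probability budget / seen.
            j = self.rng.integers(self.seen)
            if j < self.budget:
                self.items[j] = x

def manifold_step(w, x, buffer, lr=0.1, lam=0.01, sigma=1.0):
    """One update on an unlabeled instance x for a linear model w:
    penalize lam/2 * sum_b sim(x, x_b) * (w.x - w.x_b)^2, pulling the
    prediction on x toward predictions on similar buffered instances."""
    grad = np.zeros_like(w)
    for xb in buffer.items:
        sim = np.exp(-np.linalg.norm(x - xb) ** 2 / (2 * sigma ** 2))
        grad += sim * (w @ x - w @ xb) * (x - xb)
    return w - lr * lam * grad
```

Shrinking or growing `budget` is the only knob needed to fit a device's storage limit, which is the flexibility the setting calls for.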