This paper investigates a new online learning problem with doubly-streaming data, where the data streams are described by feature spaces that constantly evolve, with new features emerging and old features fading away. The challenges of this problem are twofold: 1) Data samples that ceaselessly flow in may carry patterns that shift over time, requiring learners to update and adapt on the fly. 2) Newly emerging features are described by very few samples, resulting in weak learners that tend to make erroneous predictions. A plausible idea for overcoming these challenges is to establish a relationship between the pre- and post-evolution feature spaces, so that an online learner can leverage the knowledge learned from the old features to improve learning performance on the new features. Unfortunately, this idea does not scale up to high-dimensional media streams with complex feature interplay, where it suffers from a tradeoff between onlineness (favoring shallow learners) and expressiveness (requiring deep learners). Motivated by this, we propose a novel OLD^3S paradigm, in which a shared latent subspace is discovered to summarize information from the old and new feature spaces, building an intermediate feature mapping relationship. A key trait of OLD^3S is to treat model capacity as a learnable semantics, jointly yielding the optimal model depth and parameters in an online fashion, in accordance with the complexity and non-linearity of the input data streams. Both theoretical analyses and empirical studies substantiate the viability and effectiveness of our proposal.
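To make the idea of treating model depth as something learned online more concrete, here is a minimal sketch. The abstract does not say how depth is selected, so this is an assumption for illustration only: it uses the classical Hedge multiplicative-weights rule over a set of hypothetical "experts", each standing in for a classifier attached to a different depth of a network. The names `hedge_update`, `combine`, `run_stream`, and the discount `beta` are illustrative and not taken from the paper.

```python
def hedge_update(weights, losses, beta=0.9):
    """One Hedge step: scale each expert's weight by beta ** loss, then renormalise.

    `losses` are assumed to lie in [0, 1]; a larger loss shrinks the weight more.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie in (0, 1)")
    scaled = [w * beta ** loss for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def combine(weights, predictions):
    """Weighted average of the per-depth predictions (assumed to be scalars)."""
    return sum(w * p for w, p in zip(weights, predictions))


def run_stream(loss_stream, n_experts, beta=0.9):
    """Start from uniform weights and apply one Hedge step per incoming round."""
    weights = [1.0 / n_experts] * n_experts
    for losses in loss_stream:
        weights = hedge_update(weights, losses, beta)
    return weights


# Toy stream (hypothetical data): three depths, where the middle one
# consistently incurs the smallest loss.
toy_stream = [[0.5, 0.1, 0.9] for _ in range(50)]
final_weights = run_stream(toy_stream, n_experts=3)
best_depth = max(range(3), key=lambda i: final_weights[i])
```

In this toy run the weight mass concentrates on the expert with the lowest cumulative loss, so the effective depth is chosen by the data rather than fixed in advance. In a real system the per-depth losses would come from the model's own predictions on each arriving sample, and the network parameters would be updated alongside the weights.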