Active learning is widely used to reduce labeling effort and training time by repeatedly querying only the most beneficial samples from unlabeled data. In real-world problems where data cannot be stored indefinitely because of limited storage or privacy concerns, query selection and model updates must be performed as soon as a new data sample is observed. Various online active learning methods have been studied to address these challenges; however, they have difficulty selecting representative query samples and updating the model efficiently. In this study, we propose Message Passing Adaptive Resonance Theory (MPART) for online active semi-supervised learning. The proposed model learns the distribution and topology of the input data online. It then infers the class of unlabeled data and selects informative and representative samples through message passing between nodes on the topological graph. MPART queries beneficial samples on the fly in stream-based selective sampling scenarios and continuously improves the classification model using both labeled and unlabeled data. We evaluate our model on visual (MNIST, SVHN, CIFAR-10) and audio (NSynth) datasets with comparable query selection strategies and frequencies, showing that MPART significantly outperforms competing models in online active learning environments.
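To make the stream-based selective sampling setting concrete, the following is a minimal sketch of a generic online loop, not MPART itself. It assumes a toy nearest-prototype classifier in place of the topological graph, a distance-margin rule in place of message passing, and plain self-training (pseudo-labels) for the semi-supervised update. All names here (`PrototypeClassifier`, `run_stream`, `make_blobs`, `margin_threshold`) are hypothetical and chosen only for illustration.

```python
import math
import random


class PrototypeClassifier:
    """Toy online classifier: one running-mean prototype per class (an assumption, not MPART)."""

    def __init__(self):
        self.means = {}   # label -> running mean vector
        self.counts = {}  # label -> number of samples absorbed

    def update(self, x, label):
        """Fold one sample into the prototype of `label`; the sample can then be discarded."""
        n = self.counts.get(label, 0)
        mean = self.means.get(label, [0.0] * len(x))
        self.means[label] = [(m * n + xi) / (n + 1) for m, xi in zip(mean, x)]
        self.counts[label] = n + 1

    def predict(self, x):
        """Return (label, margin), where margin is the distance gap between the two nearest prototypes."""
        if len(self.means) < 2:
            return None, 0.0  # too little knowledge to be confident
        dists = sorted((math.dist(x, m), label) for label, m in self.means.items())
        return dists[0][1], dists[1][0] - dists[0][0]


def run_stream(stream, oracle, margin_threshold=1.0):
    """Process samples one at a time, querying the oracle only when the model is uncertain."""
    model = PrototypeClassifier()
    n_queries = 0
    for x in stream:
        label, margin = model.predict(x)
        if label is None or margin < margin_threshold:
            model.update(x, oracle(x))  # uncertain: spend one label query
            n_queries += 1
        else:
            model.update(x, label)      # confident: self-train on the pseudo-label
    return model, n_queries


def make_blobs(n_per_class=100, seed=0):
    """Hypothetical toy data: two well-separated 2-D Gaussian blobs in shuffled order."""
    rng = random.Random(seed)
    points = [[rng.gauss(cx, 0.5), rng.gauss(0.0, 0.5)]
              for cx in (-3.0, 3.0) for _ in range(n_per_class)]
    rng.shuffle(points)
    return points


stream = make_blobs()
model, n_queries = run_stream(stream, oracle=lambda x: 0 if x[0] < 0 else 1)
print(f"queried {n_queries} of {len(stream)} samples")
```

Each sample is used once and then dropped, which mirrors the storage and privacy constraint described above. The margin rule only captures informativeness; selecting samples that are also representative would need some notion of local density or graph structure, which this sketch omits.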