Discovering novel concepts from unlabelled data in a continuous manner is an important desideratum of lifelong learners. In the literature, such problems have only been partially addressed, and under very restricted settings where either access to labelled data is provided for discovering novel concepts (e.g., NCD) or learning occurs for a limited number of incremental steps (e.g., class-iNCD). In this work we challenge the status quo and propose a more challenging and practical learning paradigm, called MSc-iNCD, where learning occurs continuously and without supervision, while exploiting the rich priors of large-scale pre-trained models. To this end, we propose simple baselines that are not only resilient under longer learning scenarios, but are also surprisingly strong when compared with sophisticated state-of-the-art methods. We conduct an extensive empirical evaluation on a multitude of benchmarks and demonstrate the effectiveness of the proposed baselines, which significantly raise the bar.