Lifelong learners must recognize concept vocabularies that evolve over time. A common yet underexplored scenario is learning with class labels that continually refine/expand old classes. For example, humans learn to recognize ${\tt dog}$ before dog breeds. In practical settings, dataset $\textit{versioning}$ often introduces refinement to ontologies, such as autonomous vehicle benchmarks that refine a previous ${\tt vehicle}$ class into ${\tt school-bus}$ as autonomous operations expand to new cities. This paper formalizes a protocol for studying the problem of $\textit{Learning with Evolving Class Ontology}$ (LECO). LECO requires learning classifiers in distinct time periods (TPs); each TP introduces a new ontology of "fine" labels that refines old ontologies of "coarse" labels (e.g., dog breeds that refine the previous ${\tt dog}$). LECO explores such questions as whether to annotate new data or relabel the old, how to leverage coarse labels, and whether to finetune the previous TP's model or train from scratch. To answer these questions, we leverage insights from related problems such as class-incremental learning. We validate them under the LECO protocol through the lens of image classification (CIFAR and iNaturalist) and semantic segmentation (Mapillary). Our experiments lead to surprising conclusions; while the current status quo is to relabel existing datasets with new ontologies (such as COCO-to-LVIS or Mapillary1.2-to-2.0), LECO demonstrates that a far better strategy is to annotate $\textit{new}$ data with the new ontology. However, this produces an aggregate dataset with inconsistent old-vs-new labels, complicating learning. To address this challenge, we adopt methods from semi-supervised and partial-label learning. Such strategies can surprisingly be made near-optimal, approaching an "oracle" that learns on the aggregate dataset exhaustively labeled with the newest ontology.
翻译:终身学习者必须认识随时间演变的概念词汇。 一个常见但未得到充分探索的情景是学习课堂标签, 不断完善/ 扩展旧分类。 例如, 在狗繁殖之前, 人类学会识别$[t dog] 美元。 在实际设置中, 数据集$\ textit{ versing} $ 通常会给肿瘤带来改进, 比如自动车辆基准, 将以前的 $ (t) 汽车标本改进为$ (t) 学校- bus) 。 当自动操作扩展至新城市时, 通常会将一个协议正式化, 用于研究 $(tlutit) 和 不断演化的分类问题 。 当我们用新数据或重新标签 将旧的 $(lent) 和 不断变化的变现变现时, 将“ fine” 标签引入新的“ cot” 标签, (eg) 将用旧的变现变现的变现的变现的变现的变现的变现, 和变现的变现的变现的变现的变现的变现的变现的变现的变现的变现的变现的变现的变现的 。