Generalized Category Discovery (GCD) aims to recognize both known and novel categories in a set of unlabeled data, using another dataset labeled with only known categories. Current methods ignore the differences between known and novel categories and learn them in a coupled manner, which can hurt the model's generalization and discriminative ability. Furthermore, the coupled training approach prevents these models from explicitly transferring category-specific knowledge from labeled data to unlabeled data, which can lose high-level semantic information and impair model performance. To mitigate the above limitations, we present a novel model called Decoupled Prototypical Network (DPN). By formulating a bipartite matching problem for category prototypes, DPN can not only decouple known and novel categories to achieve different training targets effectively, but also align known categories in labeled and unlabeled data to transfer category-specific knowledge explicitly and capture high-level semantics. Furthermore, DPN can learn more discriminative features for both known and novel categories through our proposed Semantic-aware Prototypical Learning (SPL). Besides capturing meaningful semantic information, SPL can also alleviate the noise of hard pseudo labels through semantic-weighted soft assignment. Extensive experiments show that DPN outperforms state-of-the-art models by a large margin on all evaluation metrics across multiple benchmark datasets. Code and data are available at https://github.com/Lackel/DPN.
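The prototype matching step can be illustrated with a small sketch. It is not taken from the DPN code. It assumes that each category is summarized by one prototype vector, that the matching cost is Euclidean distance, and that there are at least as many unlabeled prototypes as labeled ones; the abstract specifies none of these. The function name `match_prototypes` and the toy coordinates are hypothetical. For brevity the sketch searches all assignments by brute force, which only works for a handful of prototypes; a Hungarian solver such as `scipy.optimize.linear_sum_assignment` would normally replace that search.

```python
from itertools import permutations
from math import dist


def match_prototypes(labeled_protos, unlabeled_protos):
    """Match each labeled (known-category) prototype to one unlabeled prototype.

    Returns (assignment, novel):
      assignment[i] is the index of the unlabeled prototype matched to
      labeled prototype i; novel lists the unmatched unlabeled prototypes,
      which this sketch treats as novel categories.
    Assumption: Euclidean distance is the matching cost.
    """
    m, k = len(labeled_protos), len(unlabeled_protos)
    if m > k:
        raise ValueError("need at least as many unlabeled as labeled prototypes")

    # cost[i][j] = distance between labeled prototype i and unlabeled prototype j
    cost = [[dist(p, q) for q in unlabeled_protos] for p in labeled_protos]

    # Brute-force search over one-to-one assignments (toy sizes only).
    best = min(
        permutations(range(k), m),
        key=lambda perm: sum(cost[i][j] for i, j in enumerate(perm)),
    )

    matched = set(best)
    novel = [j for j in range(k) if j not in matched]
    return list(best), novel


# Toy example: two known categories, three clusters in the unlabeled data.
labeled = [(0.0, 0.0), (5.0, 5.0)]
unlabeled = [(5.1, 4.9), (9.0, 0.0), (0.2, -0.1)]
assignment, novel = match_prototypes(labeled, unlabeled)
print(assignment, novel)  # prints: [2, 0] [1]
```

In this toy run the two known prototypes are aligned with their nearest unlabeled counterparts, and the remaining unlabeled prototype is left over as a candidate novel category. That split is what lets known and novel categories be given different training targets.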