Unsupervised clustering under domain shift (UCDS) studies how to transfer knowledge from abundant unlabeled data in multiple source domains to learn representations of the unlabeled data in a target domain. In this paper, we introduce Prototype-oriented Clustering with Distillation (PCD), which not only improves the performance and applicability of existing UCDS methods but also addresses concerns about protecting the privacy of both the data and the model of the source domains. PCD first constructs a source clustering model by aligning the distributions of prototypes and data. It then distills the knowledge to the target model through cluster labels provided by the source model while simultaneously clustering the target data. Finally, it refines the target model on the target-domain data without guidance from the source model. Experiments across multiple benchmarks show the effectiveness and generalizability of our source-private clustering method.
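The three stages described above can be illustrated with a toy sketch. This is not the PCD implementation: plain k-means on raw feature vectors stands in for the prototype-data distribution alignment, hard nearest-prototype labels stand in for the distillation signal, and every name here (`assign`, `update`, `kmeans`, `toy_pipeline`) is hypothetical. The only property it tries to mirror is that the target side receives cluster labels from the source model, never source data or source prototypes.

```python
import numpy as np


def assign(x, prototypes):
    """Return the index of the nearest prototype for each row of x."""
    sq_dist = ((x[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)


def update(x, labels, fallback):
    """Set each prototype to the mean of its assigned points.

    Clusters with no members keep the corresponding row of `fallback`.
    """
    new = np.array(fallback, dtype=float)
    for k in range(len(new)):
        members = x[labels == k]
        if len(members) > 0:
            new[k] = members.mean(axis=0)
    return new


def kmeans(x, prototypes, n_iter=20):
    """Plain k-means iterations starting from the given prototypes."""
    for _ in range(n_iter):
        prototypes = update(x, assign(x, prototypes), prototypes)
    return prototypes


def toy_pipeline(source_x, target_x, init_prototypes, n_iter=20):
    """Toy three-stage pipeline (assumption: k-means replaces PCD's objectives)."""
    # Stage 1: fit a source clustering model on source data only.
    source_protos = kmeans(source_x, init_prototypes, n_iter)

    # Stage 2: the source model is queried only for cluster labels on the
    # target data; target prototypes are built from target points alone.
    pseudo_labels = assign(target_x, source_protos)
    target_mean = np.tile(target_x.mean(axis=0), (len(source_protos), 1))
    target_protos = update(target_x, pseudo_labels, target_mean)

    # Stage 3: refine on the target domain without the source model.
    target_protos = kmeans(target_x, target_protos, n_iter)
    return assign(target_x, target_protos), target_protos
```

In this sketch the second stage uses source prototypes only inside the call that produces `pseudo_labels`; everything after that line touches target data exclusively, which is the separation the abstract describes as source-private.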