Machine learning models deployed in the wild naturally encounter unlabeled samples from both known and novel classes. The challenge is to learn from both the labeled and unlabeled data in an open-world semi-supervised setting. In this paper, we introduce a new learning framework, open-world contrastive learning (OpenCon). OpenCon tackles the challenges of learning compact representations for both known and novel classes and facilitates novelty discovery along the way. We demonstrate the effectiveness of OpenCon on challenging benchmark datasets and establish competitive performance. On the ImageNet dataset, OpenCon significantly outperforms the current best method by 11.9% and 7.4% on novel and overall classification accuracy, respectively. Theoretically, OpenCon can be rigorously interpreted from an EM algorithm perspective: minimizing our contrastive loss partially maximizes the likelihood by clustering similar samples in the embedding space. The code is available at https://github.com/deeplearning-wisc/opencon.
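The EM-style interpretation above rests on a prototype-based contrastive objective that pulls each embedding toward a class prototype. Below is a minimal PyTorch sketch of such a loss, assuming L2-normalized embeddings, cosine similarity to prototypes, and nearest-prototype pseudo-labeling for unlabeled samples; the function name, temperature value, and labeling rule are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def prototype_contrastive_loss(features, prototypes, temperature=0.1):
    """Illustrative prototype-based contrastive loss (an assumption,
    not OpenCon's exact objective).

    features:   (N, D) batch of embeddings from an encoder.
    prototypes: (K, D) one prototype per known or discovered class.
    """
    # Work on the unit hypersphere, as is standard in contrastive learning.
    features = F.normalize(features, dim=1)
    prototypes = F.normalize(prototypes, dim=1)

    # Cosine similarity of every sample to every prototype, temperature-scaled.
    logits = features @ prototypes.t() / temperature  # (N, K)

    # Nearest-prototype pseudo-labels stand in for the EM E-step:
    # assign each sample to its most similar cluster.
    pseudo_labels = logits.argmax(dim=1)

    # Softmax cross-entropy over prototypes: the InfoNCE-style M-step that
    # tightens each sample around its assigned prototype.
    return F.cross_entropy(logits, pseudo_labels)

# Usage sketch with random tensors in place of real encoder outputs.
feats = torch.randn(32, 128)   # batch of 32 embeddings
protos = torch.randn(10, 128)  # 10 class prototypes (known + novel)
loss = prototype_contrastive_loss(feats, protos)
```

In this reading, the argmax assignment plays the role of the E-step and the gradient update on the cross-entropy plays the role of a partial M-step, which is the sense in which minimizing the contrastive loss partially maximizes the likelihood.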