Semi-supervised learning (SSL) is a powerful strategy for leveraging a small number of labels to learn better representations. In this paper, we focus on a practical scenario in which one aims to apply SSL when the unlabeled data may contain out-of-class samples, i.e., samples that cannot be assigned a one-hot label from the closed set of classes in the labeled data; in other words, the unlabeled data form an open set. Specifically, we introduce OpenCoS, a simple framework for handling this realistic semi-supervised learning scenario, built upon a recent framework of self-supervised visual representation learning. We first observe that out-of-class samples in the open-set unlabeled dataset can be identified effectively via self-supervised contrastive learning. OpenCoS then uses this information to overcome the failure modes of existing state-of-the-art semi-supervised methods, assigning one-hot pseudo-labels to the identified in-class unlabeled data and soft labels to the identified out-of-class data. Our extensive experimental results demonstrate the effectiveness of OpenCoS in the presence of out-of-class samples, adapting state-of-the-art semi-supervised methods to diverse scenarios involving open-set unlabeled data.
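The pipeline described above can be illustrated with a minimal sketch: score each unlabeled sample by its similarity to labeled-class prototypes in a contrastive embedding space, split the set by a threshold, and assign one-hot pseudo-labels to the in-class portion and soft labels to the out-of-class portion. This is a simplified illustration, not the paper's exact algorithm; the function name, the fixed similarity threshold, and the softmax temperature are illustrative assumptions.

```python
import numpy as np

def split_and_label(unlabeled_emb, class_prototypes, threshold=0.9, temperature=0.1):
    """Split open-set unlabeled embeddings into in-class and out-of-class parts.

    unlabeled_emb:    (N, D) L2-normalized embeddings from a contrastive encoder.
    class_prototypes: (C, D) L2-normalized mean embeddings of each labeled class.
    Returns (in_mask, hard_labels, soft_labels); threshold/temperature are
    hypothetical hyperparameters for this sketch.
    """
    # Cosine similarity of each unlabeled sample to every class prototype.
    sims = unlabeled_emb @ class_prototypes.T            # (N, C)
    max_sim = sims.max(axis=1)

    # Samples close to some labeled-class prototype are treated as in-class.
    in_mask = max_sim >= threshold

    # One-hot pseudo-labels for the identified in-class samples.
    num_classes = class_prototypes.shape[0]
    hard_labels = np.eye(num_classes)[sims[in_mask].argmax(axis=1)]

    # Soft labels (tempered softmax over similarities) for out-of-class samples.
    logits = sims[~in_mask] / temperature
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    soft_labels = exp / exp.sum(axis=1, keepdims=True)
    return in_mask, hard_labels, soft_labels
```

In a full SSL loop, the hard-labeled subset would feed the base semi-supervised method as usual, while the soft-labeled subset would contribute only a softened regularization signal, avoiding the failure mode of forcing confident closed-set labels onto open-set samples.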