They are Not Completely Useless: Towards Recycling Transferable Unlabeled Data for Class-Mismatched Semi-Supervised Learning

Semi-Supervised Learning (SSL) with mismatched classes deals with the problem that the classes of interest in the limited labeled data are only a subset of the classes in the massive unlabeled data. As a result, the classes that appear only in the unlabeled data may mislead classifier training and thus hinder the practical deployment of various SSL methods. To solve this problem, existing methods usually divide unlabeled data into in-distribution (ID) data and out-of-distribution (OOD) data, and directly discard the OOD data or weaken their influence to avoid their adverse impact. In other words, they treat OOD data as completely useless, so the potentially valuable information for classification that these data contain is ignored entirely. To remedy this defect, this paper proposes a "Transferable OOD data Recycling" (TOOR) method, which properly utilizes ID data as well as the "recyclable" OOD data to enrich the information available for class-mismatched SSL. Specifically, TOOR first assigns each unlabeled datum to either ID data or OOD data, and the ID data are used directly for training. We then treat the OOD data that are closely related to the ID data and labeled data as recyclable, and employ adversarial domain adaptation to project them into the space of ID data and labeled data. In other words, the recyclability of an OOD datum is evaluated by its transferability, and the recyclable OOD data are transferred so that they are compatible with the distribution of the known classes of interest. Consequently, TOOR extracts more information from unlabeled data than existing approaches and achieves improved performance, as demonstrated by experiments on typical benchmark datasets.
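
The three-way treatment of unlabeled data described in the abstract (use ID data, recycle transferable OOD data, set aside the rest) can be illustrated with a minimal sketch. The abstract does not say how ID/OOD membership or transferability is computed, so everything concrete here is an assumption made for illustration: the function name `split_and_recycle`, the use of maximum softmax confidence as the ID test, the per-sample `domain_scores` standing in for transferability, and both thresholds. The adversarial domain adaptation step itself is not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def split_and_recycle(unlabeled_logits, domain_scores,
                      id_threshold=0.9, recycle_threshold=0.5):
    """Partition unlabeled samples into ID, recyclable OOD, and remaining OOD.

    unlabeled_logits: per-sample logits over the known classes of interest.
    domain_scores: hypothetical per-sample scores in [0, 1]; a higher score
        means the sample looks closer to the ID/labeled distribution. This is
        an assumed stand-in for transferability, not the paper's definition.
    """
    id_idx, recyclable_idx, remaining_idx = [], [], []
    for i, (logits, score) in enumerate(zip(unlabeled_logits, domain_scores)):
        confidence = max(softmax(logits))
        if confidence >= id_threshold:
            # Assumed ID test: a confident prediction over the known classes.
            id_idx.append(i)
        elif score >= recycle_threshold:
            # OOD but judged transferable: candidate for recycling.
            recyclable_idx.append(i)
        else:
            # OOD and not transferable: left out of training.
            remaining_idx.append(i)
    return id_idx, recyclable_idx, remaining_idx


# Toy inputs: one confident sample, two near-uniform samples.
toy_logits = [[8.0, 0.0, 0.0], [1.0, 0.9, 0.8], [0.5, 0.4, 0.6]]
toy_scores = [0.2, 0.8, 0.1]
id_idx, recyclable_idx, remaining_idx = split_and_recycle(toy_logits, toy_scores)
print(id_idx, recyclable_idx, remaining_idx)
```

In this sketch the ID test takes priority, so a confident sample is treated as ID regardless of its domain score; only samples that fail the ID test are considered for recycling.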
