Positive-unlabeled (PU) learning is the task of training a binary classifier using only positive and unlabeled data. Although unlabeled data can contain positive examples, existing PU learning methods treat all unlabeled data as negative, which degrades performance. We offer a new perspective on this problem: we regard unlabeled data as noisy-labeled data and reformulate PU learning as a joint optimization problem over noisy-labeled data. The proposed method assigns initial pseudo-labels to the unlabeled data, treats the result as noisy-labeled data, and trains a deep neural network on it. Experimental results demonstrate that the proposed method significantly outperforms state-of-the-art methods on several benchmark datasets.
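As a rough illustration of the idea of treating unlabeled data as noisy-labeled data and alternating between fitting a model and revising the labels, the following toy sketch may help. It is not the authors' implementation. Several things here are assumptions made only for illustration: a one-dimensional logistic regression stands in for the deep neural network, every unlabeled point starts with the pseudo-label 0 (negative), the label update simply replaces each pseudo-label with the model's predicted probability, and the names `fit_logistic` and `joint_optimize` are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, epochs=200, lr=0.5):
    # Full-batch gradient descent on the cross-entropy loss.
    # Targets ys may be soft labels in [0, 1].
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def joint_optimize(positives, unlabeled, rounds=5):
    # Assumption: initial pseudo-labels mark every unlabeled point as negative.
    pseudo = [0.0] * len(unlabeled)
    xs = positives + unlabeled
    w, b = 0.0, 0.0
    for _ in range(rounds):
        # Step 1: fit the model on known positives plus current pseudo-labels.
        ys = [1.0] * len(positives) + pseudo
        w, b = fit_logistic(xs, ys)
        # Step 2: revise the noisy pseudo-labels with the model's predictions.
        pseudo = [sigmoid(w * x + b) for x in unlabeled]
    return w, b, pseudo


if __name__ == "__main__":
    labeled_pos = [1.5, 2.0, 2.5]
    # The first three unlabeled points are hidden positives, the rest negatives.
    unlabeled = [1.5, 2.0, 2.5, -1.5, -2.0, -2.5]
    w, b, pseudo = joint_optimize(labeled_pos, unlabeled)
    for x, p in zip(unlabeled, pseudo):
        print(f"x={x:+.1f}  pseudo-label={p:.3f}")
```

In this toy setting the pseudo-labels of unlabeled points that lie near the labeled positives rise above those of the remaining points, so the labels are no longer fixed at "negative" for all unlabeled data. How the initial pseudo-labels are actually chosen, and how the label update is regularized, are details the abstract does not specify.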