Learning from positive and unlabeled data (PU learning) has been studied in the binary (positive versus negative) classification setting, where the input data consist of (1) labeled observations from the positive class and (2) unlabeled observations drawn from both the positive and negative classes. Generative Adversarial Networks (GANs) have been used to reduce this problem to the supervised setting, which is attractive because supervised learning achieves state-of-the-art accuracy in classification tasks. To generate \textit{pseudo}-negative observations, the GANs are trained on positive and unlabeled observations with a modified loss; training a classifier on the positive and \textit{pseudo}-negative observations then yields a supervised learning problem. Generating pseudo-negative observations that are realistic enough to stand in for the missing negative-class samples is a bottleneck for current GAN-based algorithms. We propose a novel GAN-based approach that adds a second classifier to the GAN architecture. In our method, the GAN discriminator instructs the generator only to produce samples that fall into the unlabeled data distribution, while the second classifier (the observer network) monitors GAN training in order to (i) prevent the generated samples from falling into the positive distribution and (ii) learn the features that are key to distinguishing positive from negative observations. Experiments on four image datasets show that the trained observer network discriminates between real, unseen positive and negative samples better than existing techniques.
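The division of labor between the discriminator and the observer can be illustrated with a small sketch. It is not the loss used in the paper, which the abstract does not specify: it assumes plain binary cross-entropy for every term, and the function names (`discriminator_loss`, `observer_loss`, `generator_loss`) and the weight `lam` are hypothetical. Each function takes the probabilities that a network has already produced, so no network is defined here.

```python
import math


def bce(p, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and a 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def mean(values):
    return sum(values) / len(values)


def discriminator_loss(d_unlabeled, d_generated):
    """Assumed form: D scores real unlabeled samples as 1 and generated ones as 0."""
    return mean([bce(p, 1) for p in d_unlabeled]) + mean([bce(p, 0) for p in d_generated])


def observer_loss(o_positive, o_generated):
    """Assumed form: the observer scores labeled positives as 1 and
    generated (pseudo-negative) samples as 0."""
    return mean([bce(p, 1) for p in o_positive]) + mean([bce(p, 0) for p in o_generated])


def generator_loss(d_generated, o_generated, lam=1.0):
    """Assumed form: G is rewarded when D finds its samples unlabeled-like (1)
    and when the observer finds them not positive (0).
    lam is a hypothetical trade-off weight, not taken from the paper."""
    look_unlabeled = mean([bce(p, 1) for p in d_generated])
    avoid_positive = mean([bce(p, 0) for p in o_generated])
    return look_unlabeled + lam * avoid_positive


# Generated samples that D accepts (0.9) and the observer rejects as positive (0.1)
# cost the generator less than samples the observer mistakes for positives (0.9).
good = generator_loss(d_generated=[0.9], o_generated=[0.1])
bad = generator_loss(d_generated=[0.9], o_generated=[0.9])
print(good < bad)
```

Under these assumptions the generator receives two signals: the discriminator term keeps its samples inside the unlabeled distribution, and the observer term pushes them away from the positive region, which leaves the negative part of the unlabeled distribution as the target.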