This paper proposes the Differential-Critic Generative Adversarial Network (DiCGAN), which learns the distribution of user-desired data when only part of the dataset, rather than all of it, possesses the desired property. DiCGAN generates data that meets the user's expectations and can assist in designing biological products with desired properties. Existing approaches first select the desired samples and then train a regular GAN on the selected samples to derive the user-desired data distribution. However, selecting the desired data relies on global knowledge and supervision over the entire dataset. DiCGAN instead introduces a differential critic that learns from pairwise preferences, which are local knowledge and can be defined on only part of the training data. The critic is built by adding a ranking loss to the Wasserstein GAN's critic. This loss endows the difference in critic values between each pair of samples with the user preference and guides generation toward the desired data rather than the whole data distribution. To ensure data quality more efficiently, we further reformulate DiCGAN as a constrained optimization problem, based on which we theoretically prove the convergence of DiCGAN. Extensive experiments on a diverse set of datasets across various applications demonstrate that DiCGAN achieves state-of-the-art performance in learning user-desired data distributions, especially when desired data are insufficient and supervision is limited.
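The idea of adding a ranking loss to the Wasserstein GAN critic can be illustrated with a minimal sketch. The abstract does not give the exact form of the ranking loss, so the hinge-style pairwise loss, the function names, and the parameters `margin` and `rank_weight` below are assumptions made for illustration, not the paper's definitive formulation. The functions operate on precomputed critic values rather than on a trained network.

```python
import numpy as np


def wgan_critic_loss(critic_real, critic_fake):
    """Standard WGAN critic loss to minimize: E[D(fake)] - E[D(real)]."""
    return float(np.mean(critic_fake) - np.mean(critic_real))


def pairwise_ranking_loss(critic_preferred, critic_other, margin=1.0):
    """Assumed hinge-style ranking loss over preference pairs.

    The loss for a pair is zero once the critic value of the preferred
    sample exceeds that of the other sample by at least `margin`.
    """
    diff = np.asarray(critic_preferred, dtype=float) - np.asarray(critic_other, dtype=float)
    return float(np.mean(np.maximum(0.0, margin - diff)))


def differential_critic_loss(critic_real, critic_fake,
                             critic_preferred, critic_other,
                             rank_weight=1.0, margin=1.0):
    """WGAN critic loss plus a weighted pairwise ranking term (illustrative)."""
    base = wgan_critic_loss(critic_real, critic_fake)
    rank = pairwise_ranking_loss(critic_preferred, critic_other, margin)
    return base + rank_weight * rank


if __name__ == "__main__":
    # Hypothetical critic outputs for a small batch.
    loss = differential_critic_loss(
        critic_real=[1.0, 3.0],
        critic_fake=[0.0, 1.0],
        critic_preferred=[2.0, 0.5],
        critic_other=[0.0, 0.0],
        rank_weight=2.0,
    )
    print(loss)
```

In this sketch the preference pairs need not cover the whole dataset, which mirrors the abstract's point that pairwise preferences are local knowledge defined on only part of the training data.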