Identifying harmful instances, whose absence from a training dataset improves model performance, is important for building better machine learning models. Although previous studies have succeeded in estimating harmful instances under supervised settings, they cannot be trivially extended to generative adversarial networks (GANs). This is because previous approaches require that (1) the absence of a training instance directly affects the loss value and that (2) the change in the loss directly measures the harmfulness of the instance for the performance of a model. In GAN training, however, neither requirement is satisfied: (1) the generator's loss is not directly affected by the training instances, because they are not part of the generator's training steps, and (2) the values of the GAN losses normally do not capture the generative performance of a model. To address these issues, (1) we propose an influence estimation method that uses the Jacobian of the gradient of the generator's loss with respect to the discriminator's parameters (and vice versa) to trace how the absence of an instance in the discriminator's training affects the generator's parameters, and (2) we propose a novel evaluation scheme in which we assess the harmfulness of each training instance on the basis of how a GAN evaluation metric (e.g., inception score) is expected to change due to the removal of the instance. We experimentally verified that our influence estimation method correctly inferred the changes in GAN evaluation metrics. Further, we demonstrated that the removal of the identified harmful instances effectively improved the model's generative performance with respect to various GAN evaluation metrics.
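The role of the mixed Jacobian can be illustrated with a deliberately tiny toy model. The sketch below is not the paper's algorithm: it assumes a scalar generator parameter `g`, a scalar discriminator parameter `d`, a linear critic D(x) = d * x, a generator G(z) = g * z, plain SGD steps, and finite-difference derivatives. All function names and numeric values are hypothetical and chosen only for illustration. It shows that removing one real instance changes `d` in the discriminator step, and that the resulting change in `g` after the next generator step is predicted by the Jacobian of the generator's gradient with respect to `d`.

```python
# Toy illustration only (an assumed setup, not the method from the paper).

def disc_loss(d, g, reals, zs, reg=0.1):
    # Linear-critic loss: score reals high, fakes low, with L2 regularization on d.
    real_term = sum(d * x for x in reals) / len(reals)
    fake_term = sum(d * g * z for z in zs) / len(zs)
    return -real_term + fake_term + 0.5 * reg * d * d

def gen_loss(g, d, zs):
    # The generator's loss never touches the real instances directly.
    return -sum(d * g * z for z in zs) / len(zs)

def grad(f, x, eps=1e-5):
    # Central finite difference of a scalar function.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def disc_step(d, g, reals, zs, lr):
    return d - lr * grad(lambda dd: disc_loss(dd, g, reals, zs), d)

def gen_grad(g, d, zs):
    return grad(lambda gg: gen_loss(gg, d, zs), g)

def gen_step(g, d, zs, lr):
    return g - lr * gen_grad(g, d, zs)

def mixed_jacobian(g, d, zs, eps=1e-4):
    # Derivative of the generator's gradient with respect to the discriminator parameter.
    return (gen_grad(g, d + eps, zs) - gen_grad(g, d - eps, zs)) / (2 * eps)

# Hypothetical data: the last real instance is an outlier we consider removing.
reals = [1.0, 1.2, 0.8, 5.0]
zs = [0.5, 1.0, 1.5]
g0, d0, lr = 0.3, 0.2, 0.1

# One discriminator step with and without the outlier, then one generator step.
d_full = disc_step(d0, g0, reals, zs, lr)
d_loo = disc_step(d0, g0, reals[:-1], zs, lr)
g_full = gen_step(g0, d_full, zs, lr)
g_loo = gen_step(g0, d_loo, zs, lr)

# Actual change in the generator parameter caused by removing the instance.
actual = g_loo - g_full

# First-order prediction: the change in d propagates through the mixed Jacobian.
predicted = -lr * mixed_jacobian(g0, d_full, zs) * (d_loo - d_full)

print("actual change in g:   ", actual)
print("predicted change in g:", predicted)
```

Because this toy loss is bilinear in `g` and `d`, the first-order prediction matches the actual change up to finite-difference rounding; for real networks it would only be an approximation. Note also that the sketch stops at the parameter change: turning that change into a harmfulness score would additionally require estimating its effect on a GAN evaluation metric, as the abstract describes.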