This work updates a previous paper on the same topic published a few years ago. With the dramatic progress in generative modeling, a suite of new quantitative and qualitative techniques for evaluating models has emerged. Although some measures, such as Inception Score, Fréchet Inception Distance, Precision-Recall, and Perceptual Path Length, are relatively popular, GAN evaluation is not a settled issue and there is still room for improvement. For example, in addition to the quality and diversity of synthesized images, generative models should be evaluated in terms of bias and fairness. I describe new dimensions that are becoming important in assessing models, and discuss the connection between GAN evaluation and deepfakes.