We develop a measure for evaluating the performance of generative networks given two sets of images. A popular performance measure currently used for this purpose is the Fr\'echet Inception Distance (FID). However, FID assumes that images featurized using the penultimate layer of Inception-v3 follow a Gaussian distribution. This assumption makes FID easy to compute, since FID is the 2-Wasserstein distance between two Gaussian distributions fitted to the featurized images. We show, however, that Inception-v3 features of the ImageNet dataset are not Gaussian; in particular, none of the marginals is Gaussian. To remedy this problem, we model the featurized images using Gaussian mixture models (GMMs) and compute the 2-Wasserstein distance restricted to GMMs. We define a performance measure on two sets of images, which we call WaM, by using Inception-v3 (or another classifier) to featurize the images, estimating two GMMs, and using the restricted 2-Wasserstein distance to compare the GMMs. We experimentally show the advantages of WaM over FID, including that FID is more sensitive than WaM to image perturbations. By modelling the non-Gaussian features obtained from Inception-v3 as GMMs and using a GMM metric, we can more accurately evaluate generative network performance.
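The two distances contrasted above can be illustrated with a minimal sketch. It assumes that the 2-Wasserstein distance restricted to GMMs takes one common form: a discrete transport problem between mixture components whose pairwise costs are the closed-form Gaussian 2-Wasserstein distances. The abstract does not spell out the formulation, so this is an illustration under that assumption rather than the authors' implementation. The function names (`psd_sqrt`, `gaussian_w2_sq`, `gmm_w2_sq`) are hypothetical, and the GMM parameters are taken as given instead of being estimated from Inception-v3 features.

```python
import numpy as np
from scipy.optimize import linprog


def psd_sqrt(S):
    """Square root of a symmetric positive semi-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(S)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2_sq(m1, S1, m2, S2):
    """Squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2).

    This closed form is the quantity FID evaluates on Gaussians fitted to features.
    """
    S2_half = psd_sqrt(S2)
    cross = psd_sqrt(S2_half @ S1 @ S2_half)
    return float(np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * cross))


def gmm_w2_sq(w1, means1, covs1, w2, means2, covs2):
    """Squared GMM-restricted 2-Wasserstein distance (assumed formulation).

    Solves a small linear program over couplings of the component weights,
    with Gaussian squared 2-Wasserstein distances as the pairwise costs.
    """
    w1, w2 = np.asarray(w1, float), np.asarray(w2, float)
    K1, K2 = len(w1), len(w2)
    cost = np.array([[gaussian_w2_sq(means1[i], covs1[i], means2[j], covs2[j])
                      for j in range(K2)] for i in range(K1)])
    # The coupling is flattened row-major: entry (i, j) sits at index i * K2 + j.
    A_eq = np.zeros((K1 + K2, K1 * K2))
    for i in range(K1):
        A_eq[i, i * K2:(i + 1) * K2] = 1.0   # row i of the coupling sums to w1[i]
    for j in range(K2):
        A_eq[K1 + j, j::K2] = 1.0            # column j of the coupling sums to w2[j]
    b_eq = np.concatenate([w1, w2])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    if not res.success:
        raise RuntimeError(res.message)
    return float(res.fun)
```

With a single component per mixture, `gmm_w2_sq` reduces to `gaussian_w2_sq`, which matches the Gaussian case that FID assumes. In practice the weights, means, and covariances would come from GMMs fitted to the featurized images.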