Modern metrics for generative learning like Fr\'echet Inception Distance (FID) and DINOv2-Fr\'echet Distance (FD-DINOv2) demonstrate impressive performance. However, they suffer from various shortcomings, like a bias towards specific generators and datasets. To address this problem, we propose the Fr\'echet Wavelet Distance (FWD) as a domain-agnostic metric based on the Wavelet Packet Transform ($W_p$). FWD provides a sight across a broad spectrum of frequencies in images with a high resolution, preserving both spatial and textural aspects. Specifically, we use $W_p$ to project generated and real images to the packet coefficient space. We then compute the Fr\'echet distance with the resultant coefficients to evaluate the quality of a generator. This metric is general-purpose and dataset-domain agnostic, as it does not rely on any pre-trained network, while being more interpretable due to its ability to compute Fr\'echet distance per packet, enhancing transparency. We conclude with an extensive evaluation of a wide variety of generators across various datasets that the proposed FWD can generalize and improve robustness to domain shifts and various corruptions compared to other metrics.
翻译:暂无翻译