Rank-based statistical metrics, such as the invariant statistical loss (ISL), have recently emerged as robust and practically effective tools for training implicit generative models. In this work, we introduce dual-ISL, a novel likelihood-free objective for training implicit generative models that interchanges the roles of the target and model distributions in the ISL framework, yielding a convex optimization problem in the space of model densities. We prove that the resulting rank-based discrepancy $d_K$ is i) continuous under weak convergence and with respect to the $L^1$ norm, and ii) convex in its first argument, properties not shared by classical divergences such as the KL or Wasserstein distances. Building on this, we develop a theoretical framework that interprets $d_K$ as an $L^2$-projection of the density ratio $q = p/\tilde p$ onto a Bernstein polynomial basis, from which we derive exact bounds on the truncation error, precise convergence rates, and a closed-form expression for the truncated density approximation. We further extend our analysis to the multivariate setting via random one-dimensional projections, defining a sliced dual-ISL divergence that retains both convexity and continuity. We empirically show that these theoretical advantages translate into practical ones. Specifically, across several benchmarks dual-ISL converges more rapidly, delivers markedly smoother and more stable training, and more effectively prevents mode collapse than classical ISL and other leading implicit generative methods, while also providing an explicit density approximation.
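The central approximation idea, viewing a density ratio $q = p/\tilde p$ on $[0,1]$ as an $L^2$-projection onto the degree-$K$ Bernstein basis, can be sketched numerically. The code below is a minimal illustration, not the paper's algorithm: the toy ratio function `q`, the grid-based least-squares discretization of the $L^2$-projection, and all names are assumptions made for the example.

```python
import numpy as np
from math import comb

def bernstein_basis(u, K):
    # Design matrix with columns b_{k,K}(u) = C(K,k) u^k (1-u)^(K-k), k = 0..K
    return np.column_stack([comb(K, k) * u**k * (1 - u)**(K - k)
                            for k in range(K + 1)])

def l2_project(f, K, n_grid=2001):
    # Discretized L2-projection of f onto span{b_{0,K}, ..., b_{K,K}} on [0,1],
    # computed as a least-squares fit on a uniform grid
    u = np.linspace(0.0, 1.0, n_grid)
    B = bernstein_basis(u, K)
    coef, *_ = np.linalg.lstsq(B, f(u), rcond=None)
    return u, B @ coef

# Hypothetical smooth density ratio on [0,1] (purely illustrative)
q = lambda u: 1.0 + 0.5 * np.sin(2 * np.pi * u)

# Truncation error of the projection shrinks as the degree K grows
errs = []
for K in (2, 8, 20):
    u, q_hat = l2_project(q, K)
    errs.append(float(np.sqrt(np.mean((q(u) - q_hat) ** 2))))
```

Under the smoothness assumed here, the discrete $L^2$ truncation error decreases monotonically in $K$, mirroring the convergence-rate behaviour the abstract refers to.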