The spectacular success of deep generative models calls for quantitative tools to measure their statistical performance. Divergence frontiers have recently been proposed as an evaluation framework for generative models, due to their ability to measure the quality-diversity trade-off inherent to deep generative modeling. However, the statistical behavior of divergence frontiers estimated from data remains unknown to this day. In this paper, we establish non-asymptotic bounds on the sample complexity of the plug-in estimator of divergence frontiers. Along the way, we introduce a novel integral summary of divergence frontiers. We derive the corresponding non-asymptotic bounds and discuss the choice of the quantization level by balancing the two types of approximation errors arising from its computation. We also augment the divergence frontier framework by investigating the statistical performance of smoothed distribution estimators such as the Good-Turing estimator. We illustrate the theoretical results with numerical examples from natural language processing and computer vision.
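To make the objects in the abstract concrete, the following is a minimal numerical sketch, not the paper's implementation: a plug-in divergence frontier traced along mixtures R_lambda = lambda*P + (1-lambda)*Q (the standard frontier parametrization), combined with a simplified Good-Turing smoother that reallocates the estimated missing mass. All function names (`plug_in_frontier`, `good_turing`, `kl`) and the uniform reallocation over unseen bins are illustrative assumptions.

```python
import numpy as np

def kl(r, s, eps=1e-12):
    # KL(r || s) over a shared finite support; clip to avoid log(0).
    r = np.clip(r, eps, None)
    s = np.clip(s, eps, None)
    return float(np.sum(r * np.log(r / s)))

def good_turing(xs, k):
    # Simplified Good-Turing smoothing: reallocate the estimated missing
    # mass (fraction of singletons, n1/n) uniformly over unseen bins.
    counts = np.bincount(xs, minlength=k).astype(float)
    n = counts.sum()
    n1 = float((counts == 1).sum())
    unseen = counts == 0
    probs = counts / n  # plain plug-in (empirical) estimate
    if unseen.any() and n1 > 0:
        probs = probs * (1.0 - n1 / n)
        probs[unseen] = (n1 / n) / unseen.sum()
    return probs

def plug_in_frontier(p_hat, q_hat, n_lambdas=100):
    # Trace the frontier along mixtures R_lambda = lam*P + (1-lam)*Q,
    # returning pairs (KL(R_lambda || Q), KL(R_lambda || P)).
    lambdas = np.linspace(0.0, 1.0, n_lambdas)
    return [(kl(lam * p_hat + (1 - lam) * q_hat, q_hat),
             kl(lam * p_hat + (1 - lam) * q_hat, p_hat))
            for lam in lambdas]

# Hypothetical usage: k is the quantization level (number of bins) whose
# choice the paper studies; samples here are synthetic placeholders.
rng = np.random.default_rng(0)
k = 10
xs_p = rng.integers(0, k, size=500)
weights = np.linspace(1.0, 2.0, k)
xs_q = rng.choice(k, size=500, p=weights / weights.sum())
frontier = plug_in_frontier(good_turing(xs_p, k), good_turing(xs_q, k))
```

The two coordinates of each frontier point capture the quality-diversity trade-off mentioned above; an integral summary of the curve (e.g., via `np.trapz` over the frontier points) is one way to reduce it to a scalar, in the spirit of the paper's frontier integral.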