How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models

Devising domain- and model-agnostic evaluation metrics for generative models is an important and as yet unresolved problem. Most existing metrics, which were tailored solely to the image synthesis setup, exhibit a limited capacity for diagnosing the different modes of failure of generative models across broader application domains. In this paper, we introduce a 3-dimensional evaluation metric, ($\alpha$-Precision, $\beta$-Recall, Authenticity), that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion. Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity. We introduce generalization as an additional, independent dimension (to the fidelity-diversity trade-off) that quantifies the extent to which a model copies training data -- a crucial performance indicator when modeling sensitive data with requirements on privacy. The three metric components correspond to (interpretable) probabilistic quantities, and are estimated via sample-level binary classification. The sample-level nature of our metric inspires a novel use case which we call model auditing, wherein we judge the quality of individual samples generated by a (black-box) model, discarding low-quality samples and hence improving the overall model performance in a post-hoc manner.
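The sample-level view described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the estimators, so the code assumes a simplified support estimate (a ball around the real-data mean whose radius is the $\alpha$-quantile of real distances) and a nearest-neighbour test for copying, both applied in raw feature space. The function names `alpha_precision_flags`, `authenticity_flags` and `audit` are hypothetical.

```python
import numpy as np


def alpha_precision_flags(real, synth, alpha=0.9):
    """Flag synthetic samples inside an assumed alpha-support of the real data.

    Assumption: the alpha-support is approximated by a ball centred on the
    real-data mean, with radius equal to the alpha-quantile of the distances
    from real samples to that mean.
    """
    center = real.mean(axis=0)
    radius = np.quantile(np.linalg.norm(real - center, axis=1), alpha)
    return np.linalg.norm(synth - center, axis=1) <= radius


def authenticity_flags(train, synth):
    """Flag synthetic samples that do not look like near-copies of training data.

    Assumption: a synthetic sample counts as a copy when it lies closer to its
    nearest training point than that training point lies to any other
    training point.
    """
    # Distance from each training point to its nearest *other* training point.
    d_tt = np.linalg.norm(train[:, None, :] - train[None, :, :], axis=2)
    np.fill_diagonal(d_tt, np.inf)
    nn_train = d_tt.min(axis=1)
    # Distance from each synthetic point to its nearest training point.
    d_st = np.linalg.norm(synth[:, None, :] - train[None, :, :], axis=2)
    nearest = d_st.argmin(axis=1)
    return d_st.min(axis=1) >= nn_train[nearest]


def audit(train, synth, alpha=0.9):
    """Keep only synthetic samples that pass both sample-level checks."""
    keep = alpha_precision_flags(train, synth, alpha) & authenticity_flags(train, synth)
    return synth[keep], keep
```

Averaging the boolean flags gives distribution-level scores under these assumptions, while `audit` applies the same flags per sample to discard low-fidelity or copied outputs of a black-box model after generation.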

