The Vendi Score: A Diversity Evaluation Metric for Machine Learning

From arXiv. The Vendi Score is available as a pip package (https://github.com/vertaix/Vendi-Score) and as part of HuggingFace Evaluate (https://huggingface.co/spaces/Vertaix/vendiscore).

Diversity is an important criterion for many areas of machine learning (ML), including generative modeling and dataset curation. Yet little work has gone into understanding, formalizing, and measuring diversity in ML. In this paper, we address the diversity evaluation problem by proposing the Vendi Score, which connects and extends ideas from ecology and quantum statistical mechanics to ML. The Vendi Score is defined as the exponential of the Shannon entropy of the eigenvalues of a similarity matrix. This matrix is induced by a user-defined similarity function applied to the sample to be evaluated for diversity. By taking a similarity function as input, the Vendi Score lets its user specify any desired form of diversity. Importantly, unlike many existing metrics in ML, the Vendi Score does not require a reference dataset or a distribution over samples or labels. It is therefore general and applicable to any generative model, decoding algorithm, and dataset from any domain where similarity can be defined. We showcased the Vendi Score on molecular generative modeling, a domain where diversity plays an important role in enabling the discovery of novel molecules. We found that the Vendi Score addresses shortcomings of the current diversity metric of choice in that domain. We also applied the Vendi Score to generative models of images and decoding algorithms of text and found that it confirms known results about diversity in those domains. Furthermore, we used the Vendi Score to measure mode collapse, a known limitation of generative adversarial networks (GANs). In particular, the Vendi Score revealed that even GANs that capture all the modes of a labeled dataset can be less diverse than the original dataset. Finally, the interpretability of the Vendi Score allowed us to diagnose several benchmark ML datasets for diversity, opening the door for diversity-informed data augmentation.
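
The definition in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the code of the pip package: the function name `vendi_score`, the `exact_match` similarity, and the `1e-12` cutoff are assumptions made here. It also assumes the similarity matrix is symmetric positive semidefinite with ones on the diagonal and is divided by the sample size n before taking eigenvalues, so that the eigenvalues sum to 1 and the Shannon entropy is well defined.

```python
import numpy as np


def vendi_score(samples, similarity):
    """Exponential of the Shannon entropy of the eigenvalues of K / n.

    Assumes `similarity` is symmetric, positive semidefinite, and
    returns 1 for identical items (hypothetical helper, not the
    package API).
    """
    n = len(samples)
    # Pairwise similarity matrix induced by the user-defined function.
    K = np.array([[similarity(a, b) for b in samples] for a in samples],
                 dtype=float)
    # Eigenvalues of the normalized matrix; they are non-negative and sum to 1.
    eigvals = np.linalg.eigvalsh(K / n)
    # Drop numerically zero eigenvalues, using the convention 0 * log 0 = 0.
    eigvals = eigvals[eigvals > 1e-12]
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))


def exact_match(a, b):
    # Toy similarity for illustration: 1 if the items are equal, else 0.
    return 1.0 if a == b else 0.0


identical = vendi_score(["a", "a", "a"], exact_match)
distinct = vendi_score(["a", "b", "c", "d"], exact_match)
```

Under these assumptions the score behaves like an effective number of distinct items: a sample of identical items scores about 1, and a sample of n mutually dissimilar items scores about n. No reference dataset enters the computation, only the sample and the similarity function.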
