Understanding the strengths and weaknesses of machine learning (ML) algorithms is crucial for determining their scope of application. Here, we introduce the DIverse and GENerative ML Benchmark (DIGEN), a collection of synthetic datasets for comprehensive, reproducible, and interpretable benchmarking of ML algorithms for classification of binary outcomes. The DIGEN resource consists of 40 mathematical functions that map continuous features to discrete endpoints and are used to create synthetic datasets. These 40 functions were discovered using a heuristic algorithm designed to maximize the diversity of performance among multiple popular ML algorithms, thus providing a useful test suite for evaluating and comparing new methods. Access to the generative functions facilitates understanding of why a method performs poorly compared to other algorithms, thus providing ideas for improvement. The resource, with extensive documentation and analyses, is open-source and available on GitHub.
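To make the idea of a generative function concrete, the following is a minimal sketch of how a mathematical function over continuous features could be turned into a synthetic binary classification dataset. Everything in it is an assumption for illustration: the formula in `toy_generative_function`, the helper name `make_synthetic_dataset`, the standard-normal feature distribution, the sample size, and the median threshold used to binarize the output are hypothetical and are not taken from the DIGEN resource.

```python
import random


def toy_generative_function(x1, x2):
    """Hypothetical formula over two continuous features.

    This is NOT one of the 40 DIGEN functions; it only illustrates the idea.
    """
    return x1 * x2 - 0.5 * x1


def make_synthetic_dataset(func, n_samples=1000, seed=0):
    """Sample continuous features and map them to a binary endpoint.

    Assumptions (not from the source): features are drawn from a standard
    normal distribution, and the continuous output is binarized at its
    median so that the two classes are roughly balanced.
    """
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    rows = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_samples)]
    scores = [func(x1, x2) for x1, x2 in rows]
    threshold = sorted(scores)[n_samples // 2]  # median-like cut point
    labels = [1 if s >= threshold else 0 for s in scores]
    return rows, labels


if __name__ == "__main__":
    features, labels = make_synthetic_dataset(toy_generative_function)
    print(len(features), "samples,", sum(labels), "in the positive class")
```

Because the formula that produced the labels is known, a poorly performing classifier can be examined against it directly, which is the kind of interpretability the abstract describes.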