Quantification is the problem of predicting class distributions in a dataset. It is also a growing research field in supervised machine learning, for which a large variety of algorithms has been proposed in recent years. However, a comprehensive empirical comparison of quantification methods that supports algorithm selection is not yet available. In this work, we close this research gap by conducting a thorough empirical performance comparison of 24 different quantification methods on 40 data sets, considering binary as well as multiclass quantification settings. We observe that no single algorithm generally outperforms all competitors, but identify a group of methods, including the Median Sweep and the DyS framework, that performs best in the binary setting. We also find that tuning the underlying classifiers has, in most cases, only a limited impact on quantification performance. For the multiclass setting, we observe that a different, broad group of algorithms yields good performance, including the Generalized Probabilistic Adjusted Count, the readme method, the energy distance minimization method, the EM algorithm for quantification, and Friedman's method. More generally, we find that performance in multiclass quantification is inferior to that obtained in the binary setting. Our results can guide practitioners who intend to apply quantification algorithms and help researchers identify opportunities for future research.
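To make the task concrete, the following minimal sketch (not code from the study; all variable names are illustrative assumptions) contrasts the naive Classify & Count estimate with the Adjusted Count correction on synthetic binary data under prior probability shift. For brevity, the classifier's true and false positive rates are estimated on the training split, whereas a real study would typically use cross-validation.

```python
# Illustrative sketch only: Classify & Count (CC) vs. Adjusted Count (ACC)
# for estimating the positive-class prevalence of a test set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=20,
                           weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Induce prior probability shift: keep all positives in the test split but
# drop half of the negatives, so the test prevalence differs from training.
rng = np.random.default_rng(0)
pos, neg = np.where(y_te == 1)[0], np.where(y_te == 0)[0]
keep = np.concatenate([pos, rng.choice(neg, size=len(neg) // 2, replace=False)])
X_te, y_te = X_te[keep], y_te[keep]

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Classify & Count: fraction of test items predicted positive.
cc = clf.predict(X_te).mean()

# Adjusted Count: correct CC using the classifier's tpr/fpr
# (estimated on the training split here, for simplicity).
pred_tr = clf.predict(X_tr)
tpr = pred_tr[y_tr == 1].mean()   # P(predict 1 | true 1)
fpr = pred_tr[y_tr == 0].mean()   # P(predict 1 | true 0)
acc_est = float(np.clip((cc - fpr) / (tpr - fpr), 0.0, 1.0))

print(f"true prevalence: {y_te.mean():.3f}  CC: {cc:.3f}  ACC: {acc_est:.3f}")
```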