Positive-Unlabelled (PU) learning is a growing area of machine learning that aims to learn classifiers from data consisting of labelled positive instances and unlabelled instances. Whilst much work has been done proposing methods for PU learning, little has been written on how to evaluate these methods. Many popular standard classification metrics cannot be calculated precisely in the absence of fully labelled data, so alternative approaches must be taken. This short commentary paper critically reviews the main PU learning evaluation approaches and the choice of predictive accuracy measures in 51 articles proposing PU classifiers, and provides practical recommendations for improvements in this area.