Data is a key component of modern machine learning, but statistics for assessing data label quality remain sparse in the literature. Here, we introduce the DiPietro-Hazari Kappa, a novel statistical metric for assessing the quality of suggested dataset labels in the context of human annotation. Rooted in the classical Fleiss's Kappa measure of inter-annotator agreement, the DiPietro-Hazari Kappa quantifies the differential in empirical annotator agreement attained above random chance. We offer a thorough theoretical examination of Fleiss's Kappa before turning to our derivation of the DiPietro-Hazari Kappa. Finally, we conclude with a matrix formulation and a set of procedural instructions for easy computational implementation.
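As a concrete point of reference for the baseline discussed above, the following is a minimal sketch of the classical Fleiss's Kappa computation on which the DiPietro-Hazari Kappa builds. The function name `fleiss_kappa` and the (items × categories) count-matrix interface are illustrative assumptions, not the paper's own implementation; the DiPietro-Hazari Kappa itself, and its matrix formulation, are derived in the paper body.

```python
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """Classical Fleiss's Kappa for inter-annotator agreement.

    counts: (N, k) array where counts[i, j] is the number of the
    n annotators who assigned item i to category j; every row is
    assumed to sum to the same n.
    """
    N, k = counts.shape
    n = counts[0].sum()  # annotators per item (assumed constant)

    # Per-item observed agreement: agreeing annotator pairs divided
    # by all n*(n-1) ordered pairs.
    P_i = (np.square(counts).sum(axis=1) - n) / (n * (n - 1))
    P_bar = P_i.mean()  # mean observed agreement across items

    # Expected chance agreement from marginal category proportions.
    p_j = counts.sum(axis=0) / (N * n)
    P_e = np.square(p_j).sum()

    # Agreement attained above chance, normalized by its maximum.
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical example: three items, four annotators, three labels.
counts = np.array([[4, 0, 0],
                   [2, 2, 0],
                   [1, 1, 2]])
print(fleiss_kappa(counts))  # ~0.122
```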