The labels used to train machine learning (ML) models are of paramount importance. Typically for ML classification tasks, datasets contain hard labels, yet learning using soft labels has been shown to yield benefits for model generalization, robustness, and calibration. Earlier work found success in forming soft labels from multiple annotators' hard labels; however, this approach may not converge to the best labels and necessitates many annotators, which can be expensive and inefficient. We focus on efficiently eliciting soft labels from individual annotators. We collect and release a dataset of soft labels (which we call CIFAR-10S) over the CIFAR-10 test set via a crowdsourcing study (N=248). We demonstrate that learning with our labels achieves comparable model performance to prior approaches while requiring far fewer annotators -- albeit with significant temporal costs per elicitation. Our elicitation methodology therefore shows nuanced promise in enabling practitioners to enjoy the benefits of improved model performance and reliability with fewer annotators, and serves as a guide for future dataset curators on the benefits of leveraging richer information, such as categorical uncertainty, from individual annotators.
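To make the core idea concrete, below is a minimal sketch (not the authors' released code) of what "learning with soft labels" means in practice: instead of a hard class index, each training target is a probability distribution over the classes, and the loss is cross-entropy against that distribution. The function and variable names (`soft_label_loss`, `soft_targets`) are illustrative assumptions, written here in PyTorch.

```python
# Minimal sketch of soft-label training, assuming targets are probability
# distributions over the 10 CIFAR-10 classes rather than hard class indices.
import torch
import torch.nn.functional as F

def soft_label_loss(logits, soft_targets):
    """Cross-entropy between the model's predicted distribution and a soft target.

    logits:       (batch, num_classes) raw model outputs
    soft_targets: (batch, num_classes) rows summing to 1, e.g. elicited
                  annotator probabilities over the classes
    """
    log_probs = F.log_softmax(logits, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()

# Hypothetical usage with a toy batch (all values are stand-ins):
logits = torch.randn(4, 10, requires_grad=True)           # outputs for 4 images
soft_targets = torch.softmax(torch.randn(4, 10), dim=1)   # stand-in soft labels
loss = soft_label_loss(logits, soft_targets)
loss.backward()
```

When `soft_targets` is a one-hot vector, this reduces to the standard hard-label cross-entropy, which is why soft-label training is a strict generalization of the usual setup.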