Labeling data is an important step in the supervised machine learning lifecycle. It is a laborious human activity that consists of repeated decision-making: the human labeler decides which of several potential labels to apply to each example. Prior work has shown that providing AI assistance can improve accuracy in binary decision tasks. However, the role of AI assistance in more complex data-labeling scenarios with a larger set of labels has not yet been explored. We designed an AI labeling assistant that uses a semi-supervised learning algorithm to predict the most probable labels for each example. We leverage these predictions to provide assistance in two ways: (i) providing a label recommendation and (ii) reducing the labeler's decision space by focusing their attention on only the most probable labels. We conducted a user study (n=54) to evaluate an AI-assisted interface for data labeling in this context. Our results show that AI assistance improves both labeler accuracy and speed, especially when the correct label appears in the reduced label space. We discuss findings related to the presentation of AI assistance and design implications for intelligent labeling interfaces.
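The two forms of assistance can be illustrated with a minimal sketch, assuming the model exposes a probability for each candidate label. The function name `assist`, the shortlist size `k`, and the example labels are hypothetical and are not taken from the study; the semi-supervised model itself is not shown.

```python
from typing import Dict, List, Tuple


def assist(probabilities: Dict[str, float], k: int = 3) -> Tuple[str, List[str]]:
    """Return (recommended_label, shortlist) from per-label probabilities.

    Hypothetical helper: `probabilities` stands in for the output of the
    semi-supervised model, and `k` is an assumed shortlist size.
    """
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    if k < 1:
        raise ValueError("k must be at least 1")

    # Rank labels from most to least probable; break ties alphabetically
    # so the result is deterministic.
    ranked = sorted(probabilities, key=lambda label: (-probabilities[label], label))

    # (ii) Reduced decision space: keep only the k most probable labels.
    shortlist = ranked[:k]

    # (i) Label recommendation: the single most probable label.
    recommendation = shortlist[0]
    return recommendation, shortlist


if __name__ == "__main__":
    example = {"cat": 0.10, "dog": 0.60, "fox": 0.25, "owl": 0.05}
    print(assist(example, k=2))
```

In an interface built along these lines, the labeler would presumably still need a way to reach the full label set when the correct label is missing from the shortlist.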