Crowdsourcing has emerged as an effective platform for labeling large volumes of data in a cost- and time-efficient manner. Most previous works have focused on designing efficient algorithms to recover only the ground-truth labels of the data. In this paper, we consider multi-choice crowdsourced labeling with the goal of recovering not only the ground truth but also the most confusing answer and the confusion probability. The most confusing answer provides useful information about the task by revealing the most plausible answer other than the ground truth and how plausible it is. To theoretically analyze such scenarios, we propose a model in which each task has top-two plausible answers, distinguished from the rest of the choices. Task difficulty is quantified by the confusion probability between the top two, and worker reliability is quantified by the probability of giving an answer among the top two. Under this model, we propose a two-stage inference algorithm to infer the top-two answers as well as the confusion probability. We show that our algorithm achieves the minimax optimal convergence rate. We conduct both synthetic and real-data experiments and demonstrate that our algorithm outperforms other recent algorithms. We also show the applicability of our algorithm in inferring the difficulty of tasks and in training neural networks with soft labels composed of the top-two most plausible classes.
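To make the described model concrete, the following is a minimal sketch of one possible generative process consistent with the abstract: each task has a ground truth, a most confusing answer, and a confusion probability, while each worker has a reliability equal to the probability of answering within the top two. The parameter names (p_j, q_i) and the uniform-noise assumption outside the top two are our illustrative assumptions, not necessarily the paper's exact specification.

```python
import numpy as np

def simulate_responses(n_tasks=100, n_workers=30, n_choices=5, seed=0):
    """Sketch of a top-two generative model for multi-choice crowdsourcing.

    Assumptions (ours, for illustration): task i has ground truth g_i, a most
    confusing answer c_i != g_i, and a confusion probability q_i (task
    difficulty); worker j has reliability p_j, the probability of answering
    within the top two. Given a top-two answer, the worker reports g_i with
    probability 1 - q_i and c_i with probability q_i; otherwise the answer is
    uniform over the remaining choices.
    """
    rng = np.random.default_rng(seed)
    g = rng.integers(0, n_choices, size=n_tasks)                      # ground-truth labels
    c = (g + rng.integers(1, n_choices, size=n_tasks)) % n_choices    # most confusing answers
    q = rng.uniform(0.0, 0.5, size=n_tasks)                           # confusion probabilities
    p = rng.uniform(0.6, 1.0, size=n_workers)                         # worker reliabilities

    answers = np.empty((n_tasks, n_workers), dtype=int)
    for i in range(n_tasks):
        for j in range(n_workers):
            if rng.random() < p[j]:
                # answer falls within the top two
                answers[i, j] = c[i] if rng.random() < q[i] else g[i]
            else:
                # answer is uniform over the choices outside the top two
                others = [k for k in range(n_choices) if k not in (g[i], c[i])]
                answers[i, j] = rng.choice(others)
    return answers, g, c, q, p
```

Such a simulator is only meant to illustrate the roles of the confusion probability and worker reliability; it can serve as a starting point for synthetic experiments like those referenced in the abstract.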