Crowdsourcing systems have emerged as effective platforms for labeling data at relatively low cost using non-expert workers. However, inferring correct labels from multiple noisy answers has remained a challenging problem, since the quality of answers varies widely across tasks and workers. Many previous works have assumed a simple model in which the ordering of workers by reliability is fixed across tasks, and have focused on estimating worker reliabilities to aggregate answers with different weights. We propose a highly general $d$-type worker-task specialization model in which the reliability of each worker can change depending on the type of a given task, where the number $d$ of types can scale with the number of tasks. Under this model, we characterize the optimal sample complexity for correctly inferring labels at any given recovery accuracy, and propose an inference algorithm achieving the order-wise optimal bound. We conduct experiments on both synthetic and real-world datasets, and show that our algorithm outperforms existing algorithms developed under stricter model assumptions.
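To make the setting concrete, the following is a minimal sketch (not the paper's algorithm) of a hypothetical instantiation of the type-dependent reliability idea: each worker and each task carries a latent type among $d$ types, a worker answers correctly with higher probability when the types match, and a plain majority vote serves as the baseline aggregator. All parameter values (`d`, `p_match`, `p_mismatch`, problem sizes) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustrative model: latent types for tasks and workers;
# a worker's accuracy on a task depends on whether the types match.
d = 3           # number of types (assumption for illustration)
n_tasks = 200   # number of binary-labeled tasks
n_workers = 30  # number of workers

task_type = rng.integers(0, d, size=n_tasks)
worker_type = rng.integers(0, d, size=n_workers)
truth = rng.integers(0, 2, size=n_tasks)  # ground-truth binary labels

# Reliability matrix: high when worker type matches the task type,
# only slightly better than random otherwise (illustrative values).
p_match, p_mismatch = 0.9, 0.55
p_correct = np.where(worker_type[:, None] == task_type[None, :],
                     p_match, p_mismatch)

# Generate noisy answers: each worker answers each task, correct
# independently with probability p_correct.
is_correct = rng.random((n_workers, n_tasks)) < p_correct
answers = np.where(is_correct, truth[None, :], 1 - truth[None, :])

# Baseline aggregator: unweighted majority vote over workers.
# The paper's inference algorithm is more sophisticated; this only
# illustrates the label-inference problem the abstract describes.
estimate = (answers.mean(axis=0) > 0.5).astype(int)
accuracy = (estimate == truth).mean()
print(f"majority-vote recovery accuracy: {accuracy:.3f}")
```

Because mismatched workers are still mildly informative here, simple majority voting already recovers most labels; the abstract's point is that exploiting the type structure (rather than a single task-independent reliability per worker) is what achieves order-wise optimal sample complexity.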