Despite its success in a range of machine learning and data science tasks, crowdsourcing can be susceptible to attacks from dedicated adversaries. This work investigates the effect of adversaries on crowdsourced classification under the popular Dawid and Skene model. The adversaries are allowed to deviate arbitrarily from the assumed crowdsourcing model and may cooperate with one another. To address this scenario, we develop an approach that leverages the structure of the second-order moments of annotator responses to identify large numbers of adversaries and mitigate their impact on the crowdsourcing task. The potential of the proposed approach is demonstrated empirically on synthetic and real crowdsourcing datasets.