Allocation strategies improve the efficiency of crowdsourcing by decreasing the work needed to complete individual tasks accurately. However, these algorithms introduce bias by preferentially allocating workers to easy tasks, so the set of completed tasks is no longer representative of all tasks. This bias complicates inference of problem-wide properties, such as typical task difficulty, and of crowd properties, such as worker completion times, both of which carry important information beyond the crowd responses themselves. Here we study inference about problem properties when an allocation algorithm is used to improve crowd efficiency. We introduce Decision-Explicit Probability Sampling (DEPS), a novel method for inferring problem properties while accounting for the potential bias introduced by an allocation strategy. Experiments on real and synthetic crowdsourcing data show that DEPS outperforms baseline inference methods while still leveraging the efficiency gains of the allocation method. The ability to accurately infer general properties from non-representative data allows crowdsourcers to extract more knowledge from a given crowdsourced dataset.
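The bias described above can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the DEPS method and not the allocation algorithms studied here: the difficulty model, the cost rule, and the function name `simulate` are all hypothetical. It assumes harder tasks need more responses and that a greedy allocator spends a fixed budget on the cheapest tasks first.

```python
import random
import statistics


def simulate(num_tasks=1000, budget=2000, seed=0):
    """Toy model of allocation bias (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    # Assumption: each task has a latent difficulty drawn uniformly from [0, 1).
    difficulty = [rng.random() for _ in range(num_tasks)]
    # Assumption: a task of difficulty d needs 1 + round(9 * d) responses.
    needed = [1 + round(9 * d) for d in difficulty]

    # Greedy allocation: spend the response budget on the cheapest tasks first.
    order = sorted(range(num_tasks), key=lambda i: needed[i])
    completed = []
    for i in order:
        if needed[i] > budget:
            break  # every remaining task costs at least this much
        budget -= needed[i]
        completed.append(i)

    mean_all = statistics.mean(difficulty)
    mean_completed = statistics.mean(difficulty[i] for i in completed)
    return mean_all, mean_completed, len(completed)


if __name__ == "__main__":
    mean_all, mean_completed, n_done = simulate()
    print(f"completed tasks: {n_done}")
    print(f"mean difficulty, all tasks:       {mean_all:.3f}")
    print(f"mean difficulty, completed tasks: {mean_completed:.3f}")
```

In this toy setting the naive estimate taken over completed tasks understates the typical difficulty of the full task set, which is the kind of non-representativeness that a bias-aware inference method would need to correct.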