Solving complex classification tasks using deep neural networks typically requires large amounts of annotated data. However, the corresponding class labels are noisy when provided by error-prone annotators, e.g., crowd workers, and training standard deep neural networks on such labels leads to subpar performance in multi-annotator supervised learning settings. We address this issue by presenting a probabilistic training framework named multi-annotator deep learning (MaDL), in which a ground truth model and an annotator performance model are jointly trained in an end-to-end learning approach. The ground truth model learns to predict instances' true class labels, while the annotator performance model infers probabilistic estimates of annotators' performances. A modular network architecture enables us to make varying assumptions regarding annotators' performances, e.g., an optional class or instance dependency. Further, we learn annotator embeddings and estimate annotators' densities within the resulting latent space as proxies for their potentially correlated annotations. Together with a weighted loss function, this improves learning from correlated annotation patterns. In a comprehensive evaluation, we examine three research questions about multi-annotator supervised learning. Our findings indicate MaDL's state-of-the-art performance and its robustness against many correlated, spamming annotators.
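To make the two-model design concrete, below is a minimal PyTorch sketch of such an architecture. It is an illustrative reconstruction, not the paper's implementation: the class names (`GroundTruthModel`, `AnnotatorPerformanceModel`, `annotation_loss`), layer sizes, embedding dimension, and the optional `weights` argument are assumptions; MaDL's actual loss weighting is derived from annotator densities in the embedding space, which this sketch only stubs out.

```python
# Illustrative MaDL-style sketch (assumed names and sizes, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroundTruthModel(nn.Module):
    """Predicts class-membership probabilities P(y | x)."""
    def __init__(self, n_features, n_classes, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return F.softmax(self.net(x), dim=-1)  # shape: (batch, n_classes)

class AnnotatorPerformanceModel(nn.Module):
    """Estimates a per-annotator confusion matrix P(z | y, x, annotator).

    Learned annotator embeddings make the estimate annotator-dependent;
    conditioning on x adds the optional instance dependency.
    """
    def __init__(self, n_features, n_annotators, n_classes, emb_dim=16, hidden=128):
        super().__init__()
        self.n_classes = n_classes
        self.annotator_emb = nn.Embedding(n_annotators, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(n_features + emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes * n_classes),
        )

    def forward(self, x, annotator_ids):
        e = self.annotator_emb(annotator_ids)            # (batch, emb_dim)
        logits = self.net(torch.cat([x, e], dim=-1))
        logits = logits.view(-1, self.n_classes, self.n_classes)
        return F.softmax(logits, dim=-1)                 # row y: P(z | y)

def annotation_loss(p_class, confusion, noisy_labels, weights=None):
    """Negative log-likelihood of the observed noisy annotations.

    P(z | x, annotator) = sum_y P(y | x) * P(z | y, x, annotator).
    `weights` is a hypothetical hook for down-weighting annotators with
    correlated annotation patterns, as the weighted loss in MaDL intends.
    """
    p_annotation = torch.einsum("bc,bcz->bz", p_class, confusion)
    nll = F.nll_loss(torch.log(p_annotation + 1e-12), noisy_labels,
                     reduction="none")
    if weights is not None:
        nll = nll * weights
    return nll.mean()
```

Because the annotation likelihood marginalizes over the ground truth model's class probabilities, one backward pass through `annotation_loss` updates both networks simultaneously, which is what the end-to-end joint training in the abstract refers to.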