Crowdsourcing platforms are often used to collect datasets for training deep neural networks, despite producing higher levels of inaccurate labeling compared to expert labeling. There are two common strategies to manage the impact of this noise: the first aggregates redundant annotations, but at the expense of labeling substantially fewer examples; the second, considered in prior work, spends the entire annotation budget to label as many examples as possible and subsequently applies denoising algorithms to implicitly clean the dataset. We propose an approach that instead reserves a fraction of the annotations to explicitly relabel highly probable labeling errors. In particular, we allocate a large portion of the labeling budget to form an initial dataset used to train a model. This model is then used to identify the specific examples that appear most likely to be incorrect, which we relabel with the remaining budget. Experiments across three model variants and four natural language processing tasks show that, given the same annotation budget, our approach outperforms both label aggregation and advanced denoising methods designed to handle noisy labels.
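The following is a minimal sketch of the budget-split strategy described above, not the paper's implementation: most of the budget labels an initial dataset, a model trained on it flags the examples whose assigned labels it finds least probable, and the remaining budget relabels those suspects. All names (BUDGET, RELABEL_FRACTION, the toy data, and the use of logistic regression with per-label predicted probability as the error score) are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-in for a crowdsourced dataset: 2-class problem with 20% label noise.
X = rng.normal(size=(1000, 20))
true_y = (X[:, 0] + X[:, 1] > 0).astype(int)
noisy_y = np.where(rng.random(1000) < 0.2, 1 - true_y, true_y)

BUDGET = 1000            # total annotations we can afford (hypothetical)
RELABEL_FRACTION = 0.2   # fraction reserved for explicit relabeling (hypothetical)

# Step 1: spend the large portion of the budget on an initial (noisy) dataset.
n_initial = int(BUDGET * (1 - RELABEL_FRACTION))
X_init, y_init = X[:n_initial], noisy_y[:n_initial]

# Step 2: train a model on the noisy initial dataset.
model = LogisticRegression().fit(X_init, y_init)

# Step 3: flag the examples whose assigned label the model finds least
# probable -- a simple proxy for "most likely to be a labeling error".
probs = model.predict_proba(X_init)
label_prob = probs[np.arange(n_initial), y_init]
n_relabel = BUDGET - n_initial
suspects = np.argsort(label_prob)[:n_relabel]

# Step 4: spend the remaining budget relabeling the flagged examples.
# Here the true labels simulate a fresh round of annotation.
y_corrected = y_init.copy()
y_corrected[suspects] = true_y[suspects]

# Retrain on the partially corrected dataset.
final_model = LogisticRegression().fit(X_init, y_corrected)
print(f"relabeled {n_relabel} suspected errors out of {n_initial} examples")
```

In practice the error-scoring step could use any measure of model disagreement with the assigned label; low predicted probability is used here only because it keeps the sketch short.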