Annotators often disagree when labeling data, a phenomenon we term annotator label uncertainty. Annotator label uncertainty manifests as variation in labeling quality, and training with a single low-quality annotation per sample degrades model reliability. In this work, we first examine the effects of annotator label uncertainty on the model's generalizability and prediction uncertainty. We observe that both degrade in the presence of low-quality, noisy labels. Our evaluation of existing uncertainty estimation algorithms further indicates that they do not adequately respond to annotator label uncertainty. To mitigate the performance degradation, prior methods show that training models with labels collected from multiple independent annotators can enhance generalizability; however, they require a large number of annotations. We therefore introduce a novel perceptual quality-based model training framework that objectively generates multiple labels for model training, enhancing reliability while avoiding massive annotation effort. Specifically, we first select a subset of samples with low perceptual quality scores, ranked by statistical regularities of visual signals. We then assign de-aggregated labels to each sample in this subset to obtain a training set with multiple labels. Our experiments and analysis demonstrate that training with the proposed framework alleviates the degradation of generalizability and prediction uncertainty caused by annotator label uncertainty.
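
The two-step procedure described above (rank samples by perceptual quality, then attach multiple labels to the lowest-quality subset) might be sketched as follows. This is a minimal illustration and not the authors' implementation. The function names, the `fraction` parameter, and the `extra_labels` mapping are hypothetical. It also assumes that a higher score means better perceptual quality, and that the additional labels are already available from some source, since the abstract does not specify how the de-aggregated labels are produced.

```python
import numpy as np

def select_low_quality(scores, fraction):
    """Return sorted indices of the `fraction` of samples with the lowest scores.

    Assumption: a higher score means better perceptual quality.
    """
    scores = np.asarray(scores, dtype=float)
    k = int(np.ceil(fraction * len(scores)))
    order = np.argsort(scores, kind="stable")  # ascending: lowest quality first
    return np.sort(order[:k])

def build_multi_label_set(labels, scores, extra_labels, fraction=0.2):
    """Map each sample index to a list of labels.

    labels:       one label per sample (the single available annotation)
    extra_labels: hypothetical dict {index: [additional labels]}
    Only samples in the low-quality subset receive additional labels.
    """
    subset = set(select_low_quality(scores, fraction).tolist())
    multi = {}
    for i, y in enumerate(labels):
        if i in subset:
            multi[i] = [y] + list(extra_labels.get(i, []))
        else:
            multi[i] = [y]
    return multi

# Toy data: five samples with made-up quality scores and labels.
scores = [0.9, 0.1, 0.5, 0.3, 0.8]
labels = [0, 1, 2, 1, 0]
extra = {1: [2, 1], 3: [0]}

multi = build_multi_label_set(labels, scores, extra, fraction=0.4)
print(multi)
```

In this toy run, only the two lowest-scoring samples (indices 1 and 3) receive more than one label; every other sample keeps its single annotation, which is how the approach would avoid collecting multiple annotations for the whole dataset.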