Unbiased scene graph generation (SGG) has made significant progress in recent years. However, almost all existing SGG models overlook the ground-truth annotation quality of prevailing SGG datasets; that is, they assume that 1) all manually annotated positive samples are equally correct, and 2) all un-annotated negative samples are truly background. In this paper, we argue that neither assumption holds for SGG: numerous "noisy" ground-truth predicate labels violate them, and these noisy samples harm the training of unbiased SGG models. To this end, we propose NICE, a novel model-agnostic NoIsy label CorrEction strategy for SGG. NICE not only detects noisy samples but also reassigns higher-quality predicate labels to them. After NICE training, we obtain a cleaner version of the SGG dataset for model training. Specifically, NICE consists of three components: negative Noisy Sample Detection (Neg-NSD), positive NSD (Pos-NSD), and Noisy Sample Correction (NSC). First, in Neg-NSD, we formulate the task as an out-of-distribution detection problem and assign pseudo labels to all detected noisy negative samples. Then, in Pos-NSD, we use a clustering-based algorithm to divide all positive samples into multiple sets and treat the samples in the noisiest set as noisy positive samples. Finally, in NSC, we use a simple but effective weighted KNN to reassign new predicate labels to the noisy positive samples. Extensive results on different backbones and tasks attest to the effectiveness and generalization ability of each component of NICE.
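To make the NSC step concrete, the following is a minimal sketch of weighted KNN relabeling in Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the feature space, the distance metric, the weighting scheme, or the value of k. The function name `weighted_knn_relabel`, the Euclidean distance, the inverse-distance weights, and the toy features are all hypothetical choices made for this example.

```python
import numpy as np


def weighted_knn_relabel(clean_feats, clean_labels, noisy_feats,
                         num_classes, k=3, eps=1e-8):
    """Reassign a label to each noisy sample by a distance-weighted vote
    among its k nearest clean samples.

    Assumptions (not from the source): Euclidean distance and
    inverse-distance vote weights.
    """
    clean_feats = np.asarray(clean_feats, dtype=float)
    clean_labels = np.asarray(clean_labels)
    noisy_feats = np.asarray(noisy_feats, dtype=float)
    k = min(k, len(clean_feats))

    # Pairwise distances, shape (n_noisy, n_clean).
    diffs = noisy_feats[:, None, :] - clean_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    # Indices and distances of the k nearest clean samples per noisy sample.
    nn_idx = np.argsort(dists, axis=1)[:, :k]
    nn_dist = np.take_along_axis(dists, nn_idx, axis=1)

    # Closer neighbours get larger votes.
    weights = 1.0 / (nn_dist + eps)

    # Accumulate the weighted votes per class.
    votes = np.zeros((len(noisy_feats), num_classes))
    rows = np.repeat(np.arange(len(noisy_feats)), k)
    np.add.at(votes, (rows, clean_labels[nn_idx].ravel()), weights.ravel())

    return votes.argmax(axis=1)


# Toy usage: two well-separated predicate classes in a 2-D feature space.
clean_feats = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
clean_labels = [0, 0, 1, 1]
noisy_feats = [[0.2, 0.1], [5.1, 5.4]]
new_labels = weighted_knn_relabel(clean_feats, clean_labels, noisy_feats,
                                  num_classes=2, k=3)
```

In this toy setup each noisy sample takes the label of the cluster it sits in, because the nearby neighbours outweigh the single distant one. In practice the features would come from the SGG model, and the clean set would be the positive samples not flagged by Pos-NSD.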