Label noise is pervasive in real-world datasets; it encodes wrong correlation patterns and impairs the generalization of deep neural networks (DNNs). It is therefore critical to find efficient ways to detect corrupted patterns. Current methods primarily focus on designing robust training techniques that prevent DNNs from memorizing corrupted patterns. This approach has two notable caveats: 1) applying it to each individual dataset often requires a customized training process; 2) as long as the model is trained with noisy supervision, overfitting to corrupted patterns is hard to avoid, leading to a drop in detection performance. In this paper, given good representations, we propose a universally applicable and training-free solution to detect noisy labels. Intuitively, good representations help define the ``neighbors'' of each training instance, and closer instances are more likely to share the same clean label. Based on this neighborhood information, we propose two methods: the first uses ``local voting'' and checks the noisy-label consensus of nearby representations; the second is a ranking-based approach that scores each instance and filters out a guaranteed number of instances that are likely to be corrupted, again using only representations. Given good (but possibly imperfect) representations, which are commonly available in practice, we theoretically analyze how they affect local voting and provide guidelines for tuning the neighborhood size. We also prove a worst-case error bound for the ranking-based method. Experiments with both synthetic and real-world label noise demonstrate that our training-free solutions consistently and significantly improve over most training-based baselines. Code is available at github.com/UCSC-REAL/SimiRep.
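To make the ``local voting'' intuition concrete, the sketch below is a minimal, illustrative implementation of the general idea, not the paper's exact algorithm: each instance's noisy label is flagged as suspicious when it disagrees with the majority noisy label among its $k$ nearest neighbors in representation space. The function name `local_voting_flags` and the use of scikit-learn's `NearestNeighbors` are assumptions made for illustration.

```python
# Minimal sketch of k-NN "local voting" for noisy-label detection (illustrative only).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_voting_flags(features: np.ndarray, noisy_labels: np.ndarray, k: int = 10) -> np.ndarray:
    """Return a boolean array: True where an instance's noisy label is suspected to be corrupted."""
    # Query k + 1 neighbors because each point is returned as its own nearest neighbor.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)            # shape: (n, k + 1)
    neighbor_labels = noisy_labels[idx[:, 1:]]  # drop the self-neighbor in column 0
    flags = np.empty(len(noisy_labels), dtype=bool)
    for i, labels in enumerate(neighbor_labels):
        values, counts = np.unique(labels, return_counts=True)
        majority = values[np.argmax(counts)]
        flags[i] = (majority != noisy_labels[i])  # disagreement with local consensus -> suspect
    return flags

# Hypothetical usage: `features` are pre-trained representations (e.g., penultimate-layer
# embeddings) and `noisy_labels` the observed labels.
# suspects = local_voting_flags(features, noisy_labels, k=10)
```

The choice of $k$ trades off vote stability against locality; the paper's theoretical analysis provides guidelines for tuning this neighborhood size under imperfect representations.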