Prior works have found it beneficial to combine provably noise-robust loss functions, e.g., mean absolute error (MAE), with standard categorical loss functions, e.g., cross-entropy (CE), to improve their learnability. Here, we propose to use the Jensen-Shannon divergence as a noise-robust loss function and show that, interestingly, it interpolates between CE and MAE with a controllable mixing parameter. Furthermore, we make a crucial observation: CE exhibits lower consistency around noisy data points. Based on this observation, we adopt a generalized version of the Jensen-Shannon divergence for multiple distributions to encourage consistency around data points. Using this loss function, we show state-of-the-art results on both synthetic (CIFAR) and real-world (WebVision) noise with varying noise rates.
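The abstract does not include an implementation, so the following is only a minimal sketch of a weighted (generalized) Jensen-Shannon divergence loss of the kind it describes, written in PyTorch. The function names, the choice to place a single weight pi_label on the label distribution, and the uniform split of the remaining weight across predictions are illustrative assumptions, not the authors' code.

import torch
import torch.nn.functional as F

def shannon_entropy(p, eps=1e-12):
    # H(p) = -sum_k p_k log p_k, computed per sample over the class axis.
    return -(p * torch.log(p.clamp(min=eps))).sum(dim=-1)

def generalized_js_loss(logits_list, targets, pi_label=0.5):
    # Weighted generalized JS divergence among the one-hot label
    # distribution and M predicted distributions (one per augmented view):
    #   JS_pi(p_1, ..., p_{M+1}) = H(sum_i pi_i p_i) - sum_i pi_i H(p_i).
    # logits_list: list of (batch, classes) logit tensors.
    # pi_label:    weight on the label distribution; the remaining mass is
    #              split uniformly over the predictions (an assumption here).
    num_classes = logits_list[0].size(-1)
    label = F.one_hot(targets, num_classes).float()

    dists = [label] + [F.softmax(z, dim=-1) for z in logits_list]
    m = len(logits_list)
    weights = [pi_label] + [(1.0 - pi_label) / m] * m

    mixture = sum(w * p for w, p in zip(weights, dists))
    js = shannon_entropy(mixture) - sum(
        w * shannon_entropy(p) for w, p in zip(weights, dists)
    )
    return js.mean()

With a single prediction (M = 1), this reduces to the two-distribution JS loss that, per the abstract's claim, interpolates (up to scaling) between CE as pi_label approaches 0 and MAE as pi_label approaches 1. With M > 1 augmented views of the same input, the divergence among the predicted distributions penalizes inconsistency around each data point, matching the consistency motivation stated above.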