Deep learning-based histopathology image classification is a key technique for helping physicians improve the accuracy and promptness of cancer diagnosis. However, noisy labels are often inevitable in the complex manual annotation process, and they mislead the training of the classification model. In this work, we introduce a novel hard-sample-aware, noise-robust learning method for histopathology image classification. To distinguish informative hard samples from harmful noisy ones, we build an easy/hard/noisy (EHN) detection model using the sample training history. We then integrate the EHN model into a self-training architecture that lowers the noise rate through gradual label correction. With the resulting almost clean dataset, we further propose a noise suppressing and hard enhancing (NSHE) scheme to train the noise-robust model. Compared with previous works, our method retains more clean samples and can be applied directly to real-world noisy datasets without requiring a clean subset. Experimental results demonstrate that the proposed scheme outperforms current state-of-the-art methods on both synthetic and real-world noisy datasets. The source code and data are available at https://github.com/bupt-ai-cz/HSA-NRL/.
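The idea of using a sample's training history to separate easy, hard, and noisy samples can be illustrated with a minimal sketch. The abstract does not specify how the EHN detection model is built, so this is not the authors' method: a simple mean-confidence threshold stands in for it here. The function names, the thresholds, and the toy histories are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical thresholds, chosen only for illustration (not from the paper).
EASY_THRESHOLD = 0.7
NOISY_THRESHOLD = 0.3


def classify_by_history(history, easy_thr=EASY_THRESHOLD, noisy_thr=NOISY_THRESHOLD):
    """Label one sample 'easy', 'hard', or 'noisy' from its training history.

    `history` holds one value per epoch: the probability the model assigned
    to the sample's given (possibly wrong) label at that epoch.
    This thresholding rule is an assumed stand-in for a real EHN detector.
    """
    avg = mean(history)
    if avg >= easy_thr:
        return "easy"
    if avg <= noisy_thr:
        return "noisy"
    return "hard"


def split_dataset(histories):
    """Group sample indices by their heuristic easy/hard/noisy category."""
    groups = {"easy": [], "hard": [], "noisy": []}
    for idx, history in enumerate(histories):
        groups[classify_by_history(history)].append(idx)
    return groups


# Toy training histories (made-up values).
histories = [
    [0.6, 0.8, 0.9, 0.95],  # fitted early and consistently
    [0.2, 0.3, 0.5, 0.7],   # fitted slowly
    [0.1, 0.1, 0.2, 0.15],  # never fits the given label
]
groups = split_dataset(histories)
print(groups)
```

In a pipeline of the kind the abstract describes, samples flagged as noisy would be candidates for label correction, while hard samples would be kept and possibly emphasized during training. How the correction and the emphasis are actually carried out is left to the paper and its released code.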