Using only global annotations such as image class labels, weakly-supervised learning methods allow CNN classifiers to jointly classify an image and yield the regions of interest associated with the predicted class. However, without any guidance at the pixel level, such methods may yield inaccurate regions. This problem is known to be more challenging with histology images than with natural ones, since objects are less salient, structures show more variation, and foreground and background regions are more similar. Therefore, methods from the computer-vision literature for the visual interpretation of CNNs may not apply directly. In this work, we propose a simple yet efficient method based on a composite loss function that leverages information from the fully negative samples. Our new loss function contains two complementary terms: the first exploits positive evidence collected from the CNN classifier, while the second leverages the fully negative samples from the training dataset. In particular, we equip a pre-trained classifier with a decoder that refines the regions of interest. The same classifier is exploited to collect both the positive and the negative evidence at the pixel level to train the decoder. This makes it possible to take advantage of the fully negative samples that occur naturally in the data, without any additional supervision signal and using only the image class as supervision. On the public GlaS benchmark for colon cancer and a patch-based Camelyon16 benchmark for breast cancer, using three different backbones, we show the substantial improvements introduced by our method compared to several recent related methods. Our results show the benefit of using both positive and negative evidence, i.e., the evidence obtained from a classifier and the evidence naturally available in datasets. We provide an ablation study of both terms. Our code is publicly available.
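To make the two-term structure of such a composite loss concrete, the following is a minimal PyTorch sketch, not the paper's actual formulation: it assumes the classifier's positive evidence arrives as a CAM-style activation map in [0, 1], that the decoder emits a single foreground logit per pixel, and that a batch flag marks fully negative images. The names `composite_loss`, `tau`, and `lam` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def composite_loss(seg_logits, cam, is_negative, tau=0.5, lam=1.0):
    """Hypothetical two-term loss for training the decoder.

    seg_logits:  (B, 1, H, W) decoder output, one foreground logit per pixel.
    cam:         (B, 1, H, W) classifier activation map in [0, 1] (positive evidence).
    is_negative: (B,) bool tensor, True for fully negative images.
    tau:         threshold turning the CAM into pseudo-labels (assumed value).
    lam:         weight of the negative-evidence term (assumed value).
    """
    prob = torch.sigmoid(seg_logits)

    # Positive-evidence term: on positive images, push the decoder toward
    # pseudo-labels derived from the classifier's activation map.
    pseudo = (cam > tau).float()
    pos = F.binary_cross_entropy(prob, pseudo, reduction="none").mean(dim=(1, 2, 3))

    # Negative-evidence term: on fully negative images, every pixel
    # should be predicted as background.
    neg = F.binary_cross_entropy(prob, torch.zeros_like(prob),
                                 reduction="none").mean(dim=(1, 2, 3))

    w = is_negative.float()
    return ((1.0 - w) * pos + lam * w * neg).mean()
```

Keeping the two terms separate, with the negative term gated by the image-level label alone, is what lets the fully negative samples act as free pixel-level supervision.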