Whole slide image (WSI) classification often relies on deep, weakly supervised multiple instance learning (MIL) methods to handle gigapixel-resolution images and slide-level labels. Yet deep learning performs well only when it can draw on massive datasets and diverse samples, which calls for training pipelines efficient enough to scale to large datasets and for data augmentation techniques that diversify samples. Current MIL-based WSI classification pipelines, however, are memory-expensive and computationally inefficient, since they usually assemble tens of thousands of patches into bags for computation. Data augmentations, despite their popularity in other tasks, remain unexplored for WSI MIL frameworks. To address both problems, we propose ReMix, a general and efficient framework for MIL-based WSI classification. It comprises two steps: reduce and mix. First, it reduces the number of instances in WSI bags by substituting instances with instance prototypes, i.e., patch cluster centroids. Then, we propose a "Mix-the-bag" augmentation that contains four online, stochastic, and flexible latent space augmentations. It brings diverse and reliable class-identity-preserving semantic changes in the latent space while enforcing semantic-perturbation invariance. We evaluate ReMix on two public datasets with two state-of-the-art MIL methods. In our experiments, ReMix consistently improves precision, accuracy, and recall while reducing training time and memory consumption by orders of magnitude, demonstrating its effectiveness and efficiency. Code is available.
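The two steps can be illustrated with a minimal sketch, which is not the released ReMix code. It assumes patch features have already been extracted as a NumPy array per slide. The function names `reduce_bag` and `mix_bags` and the parameters `k`, `lam`, and `p` are hypothetical. The abstract does not name the four augmentations, so the nearest-prototype interpolation below is only an illustrative stand-in for one possible latent space mix between two bags of the same class.

```python
import numpy as np


def reduce_bag(features, k, n_iter=20, seed=0):
    """Reduce step (sketch): replace a bag of patch features, shape (n, d),
    with k prototypes, i.e. k-means cluster centroids, shape (k, d)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct patches of the bag.
    centroids = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every patch to every centroid: (n, k).
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def mix_bags(bag_a, bag_b, lam=0.5, p=0.5, seed=0):
    """Mix step (illustrative assumption, not the paper's definition):
    with probability p, move each prototype of bag_a toward its nearest
    prototype in bag_b, a reduced bag carrying the same slide label."""
    rng = np.random.default_rng(seed)
    dists = ((bag_a[:, None, :] - bag_b[None, :, :]) ** 2).sum(axis=2)
    nearest = bag_b[dists.argmin(axis=1)]
    mask = rng.random(len(bag_a)) < p  # stochastic, per-prototype choice
    mixed = bag_a.copy()
    mixed[mask] = lam * bag_a[mask] + (1.0 - lam) * nearest[mask]
    return mixed


# Toy usage: two synthetic "slides" with 2000 patches of 16-dim features each.
rng = np.random.default_rng(0)
slide_a = rng.normal(size=(2000, 16)).astype(np.float32)
slide_b = rng.normal(size=(2000, 16)).astype(np.float32)
protos_a = reduce_bag(slide_a, k=8)
protos_b = reduce_bag(slide_b, k=8)
augmented = mix_bags(protos_a, protos_b)
print(protos_a.shape, augmented.shape)  # (8, 16) (8, 16)
```

In this sketch the downstream MIL model would see 8 prototypes per slide rather than 2000 patches, which is where the memory and time savings claimed above would come from. The mix is applied to the reduced bags at training time, so it adds little cost.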