In computational pathology, weak supervision has become the standard for deep learning due to the gigapixel scale of WSIs and the scarcity of pixel-level annotations, with Multiple Instance Learning (MIL) established as the principal framework for slide-level model training. In this paper, we introduce a novel setting for MIL methods, inspired by recent advances in Neural Partial Differential Equation (PDE) solvers. Instead of relying on complex attention-based aggregation, we propose CAPRMIL, an efficient, aggregator-agnostic framework that removes the burden of correlation learning from the MIL aggregator. CAPRMIL produces rich, context-aware patch embeddings that promote effective correlation learning on downstream tasks. By projecting patch features -- extracted with a frozen patch encoder -- into a small set of global context/morphology-aware tokens and applying multi-head self-attention, CAPRMIL injects global context at linear computational complexity in the bag size. Paired with a simple Mean MIL aggregator, CAPRMIL matches state-of-the-art slide-level performance across multiple public pathology benchmarks, while reducing the total number of trainable parameters by 48%-92.8% versus SOTA MILs, lowering FLOPs during inference by 52%-99%, and ranking among the best models in GPU memory efficiency and training time. Our results indicate that learning rich, context-aware instance representations before aggregation is an effective and scalable alternative to complex pooling for whole-slide analysis. Our code is available at https://github.com/mandlos/CAPRMIL
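The mechanism sketched in the abstract -- projecting a bag of patch features onto a small set of global tokens, mixing the tokens with attention, and injecting the result back into the patches before a mean aggregator -- can be illustrated with a minimal single-head NumPy sketch. All names, the random token initialization, and the single-head setup are illustrative assumptions for exposition, not the authors' implementation; the point is that every step touching the N patches costs O(N·K) for K fixed tokens, i.e., linear in the bag size.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def caprmil_sketch(X, K=8, rng=None):
    """Hypothetical sketch of token-based context injection + mean MIL.

    X: (N, d) patch embeddings from a frozen encoder (N = bag size).
    K: number of global context tokens, K << N.
    Returns a (d,) slide-level representation.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    N, d = X.shape
    # Illustrative stand-in for learnable token queries.
    T = rng.normal(size=(K, d)) / np.sqrt(d)

    # 1) Tokens gather global context from all patches: O(N*K).
    A = softmax(T @ X.T / np.sqrt(d), axis=-1)   # (K, N)
    T = A @ X                                    # (K, d)

    # 2) Self-attention among the K tokens: O(K^2), independent of N.
    S = softmax(T @ T.T / np.sqrt(d), axis=-1)   # (K, K)
    T = S @ T                                    # (K, d)

    # 3) Inject global context back into each patch: O(N*K).
    B = softmax(X @ T.T / np.sqrt(d), axis=-1)   # (N, K)
    Z = X + B @ T                                # (N, d) context-aware patches

    # 4) Simple Mean MIL aggregation to a slide-level vector.
    return Z.mean(axis=0)

slide_vec = caprmil_sketch(np.random.default_rng(1).normal(size=(200, 32)))
print(slide_vec.shape)  # (32,)
```

Because the patch count N only ever appears in matrix products against the fixed K tokens, doubling the bag size roughly doubles the cost, in contrast to the O(N^2) scaling of full patch-to-patch self-attention.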