Multiple instance learning is a powerful approach for whole slide image-based diagnosis in the absence of pixel- or patch-level annotations. Despite the huge size of whole slide images, the number of individual slides is often rather small, leading to a small number of labeled samples. To improve training, we propose and investigate different data augmentation strategies for multiple instance learning based on the idea of linear interpolations of feature vectors (known as MixUp). Based on state-of-the-art multiple instance learning architectures and two thyroid cancer data sets, an exhaustive study is conducted considering a range of common data augmentation strategies. Whereas a strategy based on the original MixUp approach showed decreases in accuracy, the use of a novel intra-slide interpolation method led to consistent increases in accuracy.
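As a rough illustration of the intra-slide idea, the following minimal sketch interpolates linearly between feature vectors of patches drawn from the same slide, so the slide-level label stays unchanged. The abstract does not specify the details, so everything here is an assumption made for illustration: the function name `intra_slide_mixup`, the random pairing of patches, and the sampling of the mixing weight from a Beta(alpha, alpha) distribution as in the original MixUp formulation.

```python
import random


def intra_slide_mixup(bag, alpha=1.0, rng=None):
    """Mix each patch feature vector with another one from the SAME slide.

    bag   : list of feature vectors (lists of floats), one per patch.
    alpha : Beta(alpha, alpha) parameter for the mixing weight (assumed).
    rng   : optional random.Random instance for reproducibility.
    """
    if rng is None:
        rng = random.Random()

    # Hypothetical pairing scheme: a random permutation of the patch indices.
    partners = list(range(len(bag)))
    rng.shuffle(partners)

    mixed = []
    for vec, j in zip(bag, partners):
        lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
        # Linear interpolation between the two feature vectors.
        mixed.append([lam * a + (1.0 - lam) * b for a, b in zip(vec, bag[j])])
    return mixed


# Toy bag: 8 patches with 4-dimensional features (synthetic values).
rng = random.Random(0)
bag = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
augmented = intra_slide_mixup(bag, alpha=1.0, rng=rng)
```

Because both vectors in each pair come from the same bag, no label interpolation is needed. An inter-slide variant closer to the original MixUp would also have to mix the slide labels.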