Tumor segmentation in histopathology images is often complicated by the tumor's composition of different histological subtypes and by class imbalance. Oversampling subtypes with low-prevalence features is not a satisfactory solution, since it eventually leads to overfitting. We propose to create synthetic images with semantically conditioned deep generative networks and to combine subtype-balanced synthetic images with the original dataset to achieve better segmentation performance. We show the suitability of Generative Adversarial Networks (GANs), and especially of diffusion models, for creating realistic images based on subtype conditioning for the use case of HER2-stained histopathology. Additionally, we show the capability of diffusion models to conditionally inpaint HER2 tumor areas with modified subtypes. Combining the original dataset with an equal number of diffusion-generated images increased the tumor Dice score from 0.833 to 0.854 and almost halved the variance between the HER2 subtype recalls. These results create the basis for more reliable automatic HER2 analysis with lower performance variance between individual HER2 subtypes.
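The two reported quantities, the tumor Dice score and the variance between per-subtype recalls, can be illustrated with a minimal sketch. The function names, the flat 0/1 mask representation, and the use of population variance are assumptions made here for illustration; the abstract does not specify how the variance was computed or how the masks were stored.

```python
from statistics import pvariance


def dice_score(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flat binary masks of 0/1 values."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention (assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def recall(pred, target):
    """Fraction of true tumor pixels that the prediction recovers."""
    true_positives = sum(p * t for p, t in zip(pred, target))
    positives = sum(target)
    return float("nan") if positives == 0 else true_positives / positives


def subtype_recall_variance(recalls_by_subtype):
    """Population variance of per-subtype recalls (assumed variance definition)."""
    return pvariance(recalls_by_subtype.values())


# Toy masks, not data from the paper.
pred = [1, 1, 0, 0]
target = [1, 0, 1, 0]
print(dice_score(pred, target))
print(recall(pred, target))

# Hypothetical subtype labels with made-up recall values.
print(subtype_recall_variance({"subtype_a": 0.8, "subtype_b": 0.6}))
```

A lower value of `subtype_recall_variance` means the segmentation model performs more evenly across subtypes, which is the property the subtype-balanced synthetic images are intended to improve.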