Automated segmentation and classification of nuclei is an essential task in digital pathology. Current deep learning-based approaches require large datasets annotated by pathologists. However, existing datasets are generally imbalanced across nucleus types, which leads to substantial performance degradation. In this paper, we propose a simple but effective data augmentation technique, termed GradMix, that is specifically designed for nuclei segmentation and classification. GradMix takes a pair consisting of a major-class nucleus and a rare-class nucleus, creates a customized mixing mask, and combines the two nuclei using the mask to generate a new rare-class nucleus. Because the customized mixing mask covers both nuclei and their neighboring environment, GradMix can generate realistic rare-class nuclei in varying environments. We evaluate GradMix on two datasets. The experimental results suggest that GradMix improves the performance of nuclei segmentation and classification on imbalanced pathology image datasets.
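The mixing step described above can be illustrated with a minimal sketch. The abstract does not specify how the customized mixing mask is constructed, so everything below is an assumption made for illustration: the function names `box_blur` and `mix_nuclei`, the use of the union of the two nucleus masks, and the softening by repeated 3x3 mean filtering are all hypothetical choices, not the authors' method. The patches are assumed to be same-sized, aligned crops, each with a binary mask of its nucleus.

```python
import numpy as np


def box_blur(mask, iterations=3):
    """Soften a 2-D mask with repeated 3x3 mean filtering (assumed choice)."""
    out = mask.astype(np.float64)
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="edge")
        # Average the nine shifted views of the padded array.
        out = sum(
            padded[dy:dy + h, dx:dx + w]
            for dy in range(3)
            for dx in range(3)
        ) / 9.0
    return out


def mix_nuclei(major_patch, major_mask, rare_patch, rare_mask, iterations=3):
    """Blend a rare-class nucleus into the surroundings of a major-class one.

    Hypothetical sketch: the soft mask is built from the union of both
    nucleus masks, so the major-class nucleus is covered by content from
    the rare-class patch while the outer environment is kept.
    """
    union = np.logical_or(major_mask, rare_mask)
    alpha = box_blur(union, iterations)
    # Keep the rare-class nucleus itself fully intact.
    alpha = np.maximum(alpha, rare_mask.astype(np.float64))
    # Broadcast the (H, W) mask over the channel axis of (H, W, C) patches.
    weights = alpha[..., None]
    mixed = weights * rare_patch + (1.0 - weights) * major_patch
    return mixed, alpha
```

In this sketch, pixels far from both nuclei come entirely from the major-class patch, pixels inside the rare-class nucleus come entirely from the rare-class patch, and the band in between is a weighted blend. The label map of the mixed patch would take the rare class wherever `rare_mask` is set.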