Automated chromosome instance segmentation from metaphase cell microscopic images is critical for the diagnosis of chromosomal disorders (i.e., karyotype analysis). However, it remains a challenging task due to the lack of densely annotated datasets and the complicated morphologies of chromosomes, e.g., dense distribution, arbitrary orientations, and a wide range of lengths. To facilitate progress in this area, we manually construct a large-scale, densely annotated dataset named AutoKary2022, which contains over 27,000 chromosome instances in 612 microscopic images from 50 patients. Specifically, each instance is annotated with a polygonal mask and a class label to support precise chromosome detection and segmentation. Building on this dataset, we systematically evaluate representative methods and obtain a number of findings that give a deeper understanding of the fundamental problems in chromosome instance segmentation. We hope this dataset will advance research on medical image understanding. The dataset is available at: https://github.com/wangjuncongyu/chromosome-instance-segmentation-dataset.