Accurate morphological classification of white blood cells (WBCs) is an important step in the diagnosis of leukemia, a disease in which nonfunctional blast cells accumulate in the bone marrow. Recently, deep convolutional neural networks (CNNs) have been used successfully to classify leukocytes by training them on single-cell images from a specific domain. Most CNN models assume that the training and test data follow similar distributions, i.e., that the data are independently and identically distributed. As a result, they are not robust to differences in staining protocols, magnification, resolution, scanners, or imaging protocols, or to variation across clinical centers and patient cohorts. In addition, domain-specific data imbalances affect the generalization performance of classifiers. Here, we train a robust CNN for WBC classification by addressing cross-domain data imbalance and domain shifts. To this end, we use two loss functions and demonstrate their effectiveness for out-of-distribution (OOD) generalization. Our approach achieves the best macro F1 score among the existing methods compared and accounts for rare cell types. This is the first demonstration of imbalanced domain generalization in hematological cytomorphology and paves the way for robust single-cell classification methods for application in laboratories and clinics.
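The macro F1 score reported above is the unweighted mean of the per-class F1 scores, so a rare cell type counts as much as a common one. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the toy labels `"neutrophil"` and `"blast"` are assumptions made for illustration only.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # class p was predicted but wrong
            fn[t] += 1      # class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class has no hits
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, imbalanced example (hypothetical labels): nine common cells, one rare cell.
y_true = ["neutrophil"] * 9 + ["blast"]
# A classifier that ignores the rare class still reaches 90% accuracy ...
y_pred = ["neutrophil"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# ... but its macro F1 is much lower, because the missed class scores 0.
score = macro_f1(y_true, y_pred)
```

In this toy case the accuracy is 0.9 while the macro F1 is 9/19 (about 0.47), which is why macro-averaged metrics are a common choice when rare classes matter.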