单一血细胞图像分类的无监督的跨域地物采掘 (Unsupervised Cross-Domain Feature Extraction for Single Blood Cell Image Classification)

Diagnosing hematological malignancies requires identification and classification of white blood cells in peripheral blood smears. Domain shifts caused by different lab procedures, staining, illumination, and microscope settings hamper the re-usability of recently developed machine learning methods on data collected from different sites. Here, we propose a cross-domain adapted autoencoder to extract features in an unsupervised manner on three different datasets of single white blood cells scanned from peripheral blood smears. The autoencoder is based on an R-CNN architecture allowing it to focus on the relevant white blood cell and eliminate artifacts in the image. To evaluate the quality of the extracted features we use a simple random forest to classify single cells. We show that thanks to the rich features extracted by the autoencoder trained on only one of the datasets, the random forest classifier performs satisfactorily on the unseen datasets, and outperforms published oracle networks in the cross-domain task. Our results suggest the possibility of employing this unsupervised approach in more complicated diagnosis and prognosis tasks without the need to add expensive expert labels to unseen data.

翻译：血清恶性诊断要求确认和分类周边血液涂片中的白血细胞。不同实验室程序、污点、光化和显微镜设置导致的域变妨碍最近开发的关于从不同地点收集的数据的机器学习方法的可重新使用性。这里, 我们提出一个跨域经改造的自动编码器, 以不受监督的方式提取从周边血液涂片中扫描的单一白细胞的三个不同数据集的特征。自动编码器基于一个 R- CNN 结构, 使其能够关注相关的白血细胞, 并消除图像中的文物。为了评估提取的特征的质量, 我们使用简单的随机森林来对单个细胞进行分类。我们表明,由于只对一个数据集进行自动编码培训的自动编码器所提取的丰富特征, 随机的森林分类器在未知的数据集上表现令人满意, 超越了跨域任务中已公布的孔径网络。我们的结果表明, 有可能在更复杂的诊断和剖析工作中采用这种未超的处理方法, 而不需要增加昂贵的专家标签。

相关内容

自编码器

关注 140

自动编码器是一种人工神经网络，用于以无监督的方式学习有效的数据编码。自动编码器的目的是通过训练网络忽略信号“噪声”来学习一组数据的表示（编码），通常用于降维。与简化方面一起，学习了重构方面，在此，自动编码器尝试从简化编码中生成尽可能接近其原始输入的表示形式，从而得到其名称。基本模型存在几种变体，其目的是迫使学习的输入表示形式具有有用的属性。自动编码器可有效地解决许多应用问题，从面部识别到获取单词的语义。

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日