Automated analysis of chest radiographs using deep learning has considerable potential to improve clinical diagnosis. However, deep learning models typically require large amounts of annotated data to achieve high performance, which is often an obstacle to adapting them to the medical domain. In this paper, we build a data-efficient learning framework that uses radiology reports to improve medical image classification when labeled data is limited (fewer than 1000 examples). Specifically, we examine image-captioning pretraining as a way to learn high-quality medical image representations that require fewer labeled examples for downstream training. Following joint pretraining of a convolutional encoder and a transformer decoder, we transfer the learned encoder to various classification tasks. Averaged over 9 pathologies, our model achieves higher classification performance than ImageNet-supervised and in-domain supervised pretraining when labeled training data is limited.
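The two-stage recipe described above (captioning pretraining, then encoder transfer) can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the architecture used in this work: the class names `ConvEncoder`, `Captioner`, and `Classifier`, the layer sizes, the vocabulary size, and the choice of a single 9-way multi-label head are all hypothetical, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn


class ConvEncoder(nn.Module):
    """Tiny stand-in for the convolutional image encoder (hypothetical sizes)."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return self.net(images)  # (batch, dim, H/4, W/4)


class Captioner(nn.Module):
    """Encoder plus transformer decoder, trained jointly to predict report tokens."""

    def __init__(self, encoder: nn.Module, vocab_size: int = 100, dim: int = 64):
        super().__init__()
        self.encoder = encoder
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerDecoderLayer(
            d_model=dim, nhead=4, dim_feedforward=128, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.to_vocab = nn.Linear(dim, vocab_size)

    def forward(self, images: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        # Flatten the feature map into a sequence the decoder can attend over.
        memory = self.encoder(images).flatten(2).transpose(1, 2)  # (batch, H*W, dim)
        seq_len = tokens.size(1)
        # Causal mask: each position may only attend to earlier report tokens.
        causal = torch.triu(
            torch.full((seq_len, seq_len), float("-inf")), diagonal=1
        )
        hidden = self.decoder(self.embed(tokens), memory, tgt_mask=causal)
        return self.to_vocab(hidden)  # (batch, seq_len, vocab_size)


class Classifier(nn.Module):
    """Reuses the pretrained encoder; only the linear head is new."""

    def __init__(self, encoder: nn.Module, num_labels: int = 9, dim: int = 64):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(dim, num_labels)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        features = self.encoder(images).mean(dim=(2, 3))  # global average pooling
        return self.head(features)  # one logit per pathology (assumed multi-label)


# Stage 1: captioning pretraining step on random stand-in data.
encoder = ConvEncoder()
captioner = Captioner(encoder)
images = torch.randn(2, 1, 32, 32)       # fake grayscale radiographs
tokens = torch.randint(0, 100, (2, 7))   # fake tokenized reports
caption_logits = captioner(images, tokens[:, :-1])
caption_loss = nn.functional.cross_entropy(
    caption_logits.reshape(-1, 100), tokens[:, 1:].reshape(-1)
)

# Stage 2: transfer the same encoder object to a classification model.
classifier = Classifier(captioner.encoder)
class_logits = classifier(images)
```

In this sketch the decoder is discarded after pretraining and only the encoder weights carry over, which mirrors the transfer step described in the abstract; whether the encoder is frozen or fine-tuned downstream is left open here.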