A key challenge in training neural networks for a given medical imaging task is the difficulty of obtaining a sufficient number of manually labeled examples. In contrast, textual imaging reports, which are often readily available in medical records, contain rich but unstructured interpretations written by experts as part of standard clinical practice. We propose using these textual reports as a form of weak supervision to improve the image interpretation performance of a neural network without requiring additional manually labeled examples. We use an image-text matching task to train a feature extractor, then fine-tune it in a transfer-learning setting on a supervised task using a small labeled dataset. The result is a neural network that interprets images automatically, without requiring textual reports at inference time. This approach can be applied to any task for which image-text pairs are readily available. We evaluate our method on three classification tasks and find consistent performance improvements, reducing the need for labeled data by 67%-98%.
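The image-text matching pretraining described above can be sketched with a contrastive objective: embed each image and its paired report, score all pairwise similarities in a batch, and train so that each image's true report scores highest. The sketch below is a minimal, hypothetical illustration in plain numpy; the encoder outputs are stand-in embedding matrices, and the `temperature` value is an assumption, not a parameter from the paper.

```python
import numpy as np

def log_softmax(x):
    # Numerically stable row-wise log-softmax.
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def matching_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric image-text matching loss over a batch.

    img_emb, txt_emb: (n, d) arrays where row i of each is an
    image/report pair. Correct matches lie on the diagonal of the
    similarity matrix. (Illustrative sketch, not the paper's exact loss.)
    """
    # L2-normalize so similarity is cosine similarity.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (n, n) similarity scores
    n = logits.shape[0]
    diag = np.arange(n)
    # Cross-entropy in both directions: image -> report and report -> image.
    loss_i2t = -log_softmax(logits)[diag, diag].mean()
    loss_t2i = -log_softmax(logits.T)[diag, diag].mean()
    return (loss_i2t + loss_t2i) / 2
```

After pretraining a feature extractor with such an objective, the text branch is discarded and the image encoder is fine-tuned on the small labeled dataset, which is why no reports are needed at inference time.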