Modern studies in radiograph representation learning rely on either self-supervision to encode invariant semantics or associated radiology reports to incorporate medical expertise, while the complementarity between the two is barely noticed. To explore this, we formulate self-completion and report-completion as two complementary objectives and present a unified framework based on masked record modeling (MRM). In practice, MRM reconstructs masked image patches and masked report tokens following a multi-task scheme to learn knowledge-enhanced semantic representations. With MRM pre-training, we obtain models that transfer well to various radiography tasks. Specifically, we find that MRM offers superior performance in label-efficient fine-tuning. For instance, MRM achieves 88.5% mean AUC on CheXpert using 1% labeled data, outperforming previous R$^2$L methods that use 100% of the labels. On NIH ChestX-ray, MRM outperforms the best-performing counterpart by about 3% under small labeling ratios. Moreover, MRM surpasses self- and report-supervised pre-training in identifying the pneumonia type and the pneumothorax area, sometimes by large margins.
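As a hedged illustration of the multi-task scheme described above, the sketch below combines MAE-style masked patch regression (self-completion) with BERT-style masked token prediction (report-completion) over a shared Transformer. All names, shapes, and hyper-parameters (`MaskedRecordModel`, `mrm_loss`, the loss weight `lam`, the masking ratios) are illustrative assumptions, not the paper's actual implementation; positional and modality embeddings are omitted for brevity.

```python
# Minimal sketch of masked record modeling, assuming a single shared
# Transformer over concatenated image-patch and report-token sequences.
# Every name and hyper-parameter below is an illustrative assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedRecordModel(nn.Module):
    def __init__(self, dim=768, patch_dim=16 * 16 * 3, vocab_size=30522):
        super().__init__()
        self.image_proj = nn.Linear(patch_dim, dim)      # patch embedding
        self.text_embed = nn.Embedding(vocab_size, dim)  # token embedding
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.pixel_head = nn.Linear(dim, patch_dim)      # reconstruct patches
        self.token_head = nn.Linear(dim, vocab_size)     # predict masked tokens

    def forward(self, patches, tokens):
        # patches: (B, Np, patch_dim), with masked patches zeroed out
        # tokens:  (B, Nt), with masked positions set to a [MASK] id
        x = torch.cat([self.image_proj(patches), self.text_embed(tokens)], dim=1)
        h = self.encoder(x)
        n_p = patches.size(1)
        return self.pixel_head(h[:, :n_p]), self.token_head(h[:, n_p:])

def mrm_loss(model, patches, tokens, img_mask, tok_mask,
             target_px, target_tok, lam=1.0):
    """Multi-task objective: pixel regression on masked patches
    (self-completion) plus cross-entropy on masked report tokens
    (report-completion), weighted by an assumed trade-off `lam`."""
    pred_px, pred_tok = model(patches, tokens)
    loss_img = ((pred_px - target_px) ** 2)[img_mask].mean()
    loss_txt = F.cross_entropy(pred_tok[tok_mask], target_tok[tok_mask])
    return loss_img + lam * loss_txt

if __name__ == "__main__":
    # Smoke test with random data; real records would supply the
    # unmasked pixel targets and ground-truth token ids instead.
    B, Np, Nt = 2, 196, 32
    model = MaskedRecordModel()
    patches = torch.randn(B, Np, 16 * 16 * 3)
    tokens = torch.randint(0, 30522, (B, Nt))
    img_mask = torch.rand(B, Np) < 0.75   # assumed: mask 75% of patches
    tok_mask = torch.rand(B, Nt) < 0.15   # assumed: mask 15% of tokens
    loss = mrm_loss(model, patches, tokens, img_mask, tok_mask,
                    target_px=torch.randn(B, Np, 16 * 16 * 3),
                    target_tok=torch.randint(0, 30522, (B, Nt)))
    print(loss.item())
```

The paper's masking ratios, decoder design, and loss weighting may well differ; the point of the sketch is only that both completion objectives backpropagate through one shared representation, which is how the two sources of supervision complement each other.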