In hospitals, data are siloed in modality-specific information systems that expose the same underlying information through different modalities, such as the various medical imaging exams a patient undergoes (CT, MRI, PET, ultrasound, etc.) and their associated radiology reports. This offers a unique opportunity to obtain and use at train time multiple views of the same information that may not all be available at test time. In this paper, we propose an innovative framework that makes the most of the available data by learning representations of a multi-modal input that are resilient to modality dropping at test time, building on recent advances in mutual information maximization. By maximizing cross-modal mutual information at train time, we outperform several state-of-the-art baselines in two different settings, medical image classification and segmentation. In particular, our method is shown to strongly improve the inference-time performance of weaker modalities.
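To make the core idea concrete, below is a minimal sketch of cross-modal mutual information maximization via a symmetric InfoNCE-style contrastive loss, a common MI lower-bound estimator. This is an illustrative assumption, not the paper's exact objective: the function `infonce_loss`, the embedding shapes, and the temperature value are all hypothetical, and a real system would produce the embeddings with trained per-modality encoders.

```python
import numpy as np

def infonce_loss(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE loss between two batches of modality embeddings.

    Minimizing this loss maximizes a lower bound on the mutual
    information between the two modalities, pulling matched pairs
    (same patient, different modality) together and pushing
    mismatched pairs apart.
    """
    # L2-normalize each embedding so similarities are cosine similarities
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    # Pairwise similarity matrix; matched pairs lie on the diagonal
    logits = z_a @ z_b.T / temperature

    def cross_entropy_diag(l):
        # Numerically stable log-softmax per row, target = diagonal entry
        l = l - l.max(axis=1, keepdims=True)
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Symmetrize: modality A retrieves B, and B retrieves A
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))

# Toy usage: embeddings of a hypothetical imaging modality and its
# paired radiology-report modality for the same 8 patients.
rng = np.random.default_rng(0)
img_emb = rng.normal(size=(8, 16))
txt_emb = img_emb + 0.01 * rng.normal(size=(8, 16))   # well-aligned pairs
unrelated = rng.normal(size=(8, 16))                  # no shared information

aligned_loss = infonce_loss(img_emb, txt_emb)
random_loss = infonce_loss(img_emb, unrelated)
```

Under this objective, a lower loss corresponds to a higher mutual-information bound, so aligned modality pairs score markedly better than unrelated ones, which is the train-time signal that makes each single-modality representation informative about the others.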