Multimodal deep learning has been used to predict clinical endpoints and diagnoses from routine clinical data. However, these models suffer from a scaling problem: they must learn pairwise interactions between every piece of information in every data type, which escalates model complexity beyond manageable scales. This has so far precluded the widespread use of multimodal deep learning. Here, we present a new technical approach, "learnable synergies", in which the model selects only the relevant interactions between data modalities and keeps an "internal memory" of relevant data. Our approach scales easily and adapts naturally to multimodal inputs from routine clinical practice. We demonstrate it on three large multimodal datasets from radiology and ophthalmology and show that it outperforms state-of-the-art models on clinically relevant diagnosis tasks. Our approach is transferable and will allow multimodal deep learning to be applied to a broad set of clinically relevant problems.
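The abstract does not spell out the architecture, so the following is only a minimal sketch of one plausible reading: "learnable synergies" with an "internal memory" interpreted as a Perceiver-style cross-attention block, in which a small learnable latent array queries tokens from all modalities. Attention cost then grows linearly with the total number of input tokens rather than quadratically with full pairwise interaction. The module name, hyperparameters, and token shapes below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the paper's released code): cross-attention from a
# learnable latent array ("internal memory") over concatenated modality
# tokens, so interactions are selected by attention rather than modeled
# pairwise. Cost is O(num_latents * num_tokens) instead of O(num_tokens^2).
import torch
import torch.nn as nn


class LearnableSynergyBlock(nn.Module):  # hypothetical name for illustration
    def __init__(self, dim=256, num_latents=32, num_heads=4):
        super().__init__()
        # Learnable latent array: a fixed-size "internal memory" of relevant data.
        self.latents = nn.Parameter(torch.randn(num_latents, dim) * 0.02)
        # Cross-attention lets the latents select relevant cross-modal
        # interactions instead of learning every pairwise token interaction.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim)
        )

    def forward(self, modality_tokens):
        # modality_tokens: list of (batch, tokens_i, dim) tensors, one per
        # modality (e.g. imaging patch embeddings, embedded clinical parameters).
        x = torch.cat(modality_tokens, dim=1)       # (batch, total_tokens, dim)
        q = self.latents.unsqueeze(0).expand(x.shape[0], -1, -1)
        mem, _ = self.cross_attn(q, x, x)           # latents query all modalities
        mem = self.norm(mem + q)
        return mem + self.ff(mem)                   # (batch, num_latents, dim)


# Usage: two modalities with different token counts fuse into a fixed-size
# memory, so adding another modality only lengthens the key/value sequence.
block = LearnableSynergyBlock()
imaging = torch.randn(2, 196, 256)   # e.g. radiograph patch embeddings
clinical = torch.randn(2, 40, 256)   # e.g. embedded lab values
fused = block([imaging, clinical])   # (2, 32, 256), ready for a diagnosis head
```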