In medical applications, deep learning methods are built to automate diagnostic tasks. However, a clinically relevant question that practitioners commonly face is how to predict the future trajectory of a disease (prognosis). Current methods for this problem often require domain knowledge and are complicated to apply. In this paper, we formulate prognosis prediction as a one-to-many forecasting problem from multimodal data. Inspired by a clinical decision-making process with two agents, a radiologist and a general practitioner, we model prognosis prediction with two transformer-based components that share information with each other. The first block of the model analyzes the imaging data, and the second block takes the internal representations of the first as inputs and fuses them with auxiliary patient data. We demonstrate the effectiveness of our method in predicting the development of structural knee osteoarthritis changes over time. Our results show that the proposed method outperforms the state-of-the-art baselines across various performance metrics. In addition, we empirically show that multi-agent transformers with a depth of 2 are sufficient to achieve good performance. Our code is publicly available at \url{https://github.com/MIPT-Oulu/CLIMAT}.
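The two-block data flow described in the abstract can be illustrated with a toy, dependency-free sketch. It is not the CLIMAT implementation: the weights are random and untrained, each layer is reduced to single-head self-attention with a residual connection, and every name (`AttentionLayer`, `forecast`, `horizon`, `n_classes`, the token sizes, the auxiliary values) is a hypothetical choice made for illustration only. It only shows how a "radiologist" block could pass its internal representations to a "general practitioner" block that fuses them with auxiliary data and emits one prediction per future time point.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the toy weights are reproducible


def linear(d_in, d_out):
    """Return a random d_in x d_out weight matrix (untrained, illustration only)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(d_out)] for _ in range(d_in)]


def matvec(x, w):
    """Project vector x (length d_in) through weight matrix w (d_in x d_out)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


class AttentionLayer:
    """Single-head self-attention with a residual connection (no MLP or norm)."""

    def __init__(self, d):
        self.d = d
        self.wq, self.wk, self.wv = linear(d, d), linear(d, d), linear(d, d)

    def __call__(self, tokens):
        q = [matvec(t, self.wq) for t in tokens]
        k = [matvec(t, self.wk) for t in tokens]
        v = [matvec(t, self.wv) for t in tokens]
        out = []
        for i, tok in enumerate(tokens):
            scores = [sum(a * b for a, b in zip(q[i], kj)) / math.sqrt(self.d)
                      for kj in k]
            w = softmax(scores)
            mixed = [sum(w[j] * v[j][c] for j in range(len(tokens)))
                     for c in range(self.d)]
            out.append([a + b for a, b in zip(tok, mixed)])
        return out


def forecast(image_tokens, aux_features, horizon=4, depth=2, n_classes=5):
    d = len(image_tokens[0])
    # Block 1 ("radiologist"): attends over imaging tokens only and keeps the
    # summary token after every layer as its internal representations.
    tokens = [[0.0] * d] + image_tokens
    image_states = []
    for layer in [AttentionLayer(d) for _ in range(depth)]:
        tokens = layer(tokens)
        image_states.append(tokens[0])
    # Block 2 ("general practitioner"): one query token per future time point,
    # fused with block 1's states and an embedding of the auxiliary data.
    aux_token = matvec(aux_features, linear(len(aux_features), d))
    queries = [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(horizon)]
    tokens = queries + image_states + [aux_token]
    for layer in [AttentionLayer(d) for _ in range(depth)]:
        tokens = layer(tokens)
    head = linear(d, n_classes)
    # One-to-many output: a class distribution for each future time point.
    return [softmax(matvec(tokens[t], head)) for t in range(horizon)]


# Made-up inputs: 6 imaging tokens of size 8 and 3 pre-scaled auxiliary values.
image_tokens = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(6)]
aux = [0.62, 1.0, 0.27]
probs = forecast(image_tokens, aux)
print(len(probs), len(probs[0]))  # 4 5
```

A single input (one image plus auxiliary data) yields `horizon` distributions, which is the one-to-many formulation; `depth=2` mirrors the depth the abstract reports as sufficient, although nothing in this untrained toy tests that claim.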