Multi-modal representation methods have achieved strong performance in medical applications by extracting more robust features from multi-domain data. However, existing methods usually need to train additional branches for downstream tasks, which increases model complexity in clinical applications and introduces additional human inductive bias. Moreover, very few studies exploit the rich clinical knowledge embedded in daily clinical reports. To this end, we propose a novel medical generalist agent, MGA, that can address three kinds of common clinical tasks via clinical-report knowledge transformation. Unlike existing methods, MGA can easily adapt to different tasks without task-specific downstream branches, even when the corresponding annotations are missing. More importantly, to the best of our knowledge, this is the first attempt to use medical professional language guidance as a transmission medium to steer the agent's behavior. The proposed method is evaluated on four well-known open-source chest X-ray datasets: MIMIC-CXR, CheXpert, MIMIC-CXR-JPG, and MIMIC-CXR-MS. Promising results are obtained, validating the effectiveness of the proposed MGA. Code is available at: https://github.com/SZUHvern/MGA