Multimodal clinical reasoning in gastrointestinal (GI) oncology requires the integrated interpretation of endoscopic imagery, radiological data, and biochemical markers. Multimodal Large Language Models (MLLMs) show clear potential for this task, but they often suffer from context dilution and hallucination when confronted with intricate, heterogeneous medical histories. To address these limitations, this work proposes a hierarchical Multi-Agent Framework that emulates the collaborative workflow of a human Multidisciplinary Team (MDT). The system attained a composite expert evaluation score of 4.60/5.00, a substantial improvement over the monolithic baseline. The agent-based architecture yielded its largest gains in reasoning logic and medical accuracy. These findings indicate that mimetic, agent-based collaboration offers a scalable, interpretable, and clinically robust paradigm for automated decision support in oncology.