Although electrocardiograms (ECGs) play a dominant role in cardiovascular diagnosis and treatment, their intrinsic data forms and representational patterns pose significant challenges for medical multimodal large language models (Med-MLLMs) in achieving cross-modal semantic alignment. To address this gap, we propose Heartcare Suite, a unified ECG suite designed for dual signal-image modeling and understanding. It comprises three components. (i) Heartcare-400K: a fine-grained ECG instruction dataset built with our data pipeline engine, HeartAgent, by integrating 12,170 high-quality clinical ECG reports from top hospitals with open-source data. (ii) Heartcare-Bench: a systematic benchmark that assesses model performance in multi-perspective ECG understanding and cross-modal generalization, providing guidance for optimizing ECG comprehension models. (iii) HeartcareGPT: built upon Beat, a structure-aware discrete tokenizer, it adopts our proposed DSPA (Dual Stream Projection Alignment) paradigm, a dual-encoder projection alignment mechanism that enables joint optimization and modeling of native ECG signals and images within a shared feature space. Heartcare achieves consistent improvements across diverse ECG understanding tasks, validating both the effectiveness of the unified modeling paradigm and the necessity of a high-quality data pipeline, and establishing a methodological foundation for extending Med-MLLMs toward physiological signal domains. Our project is available at https://github.com/DCDmllm/Heartcare-Suite.