Lung cancer continues to be the leading cause of cancer-related deaths globally, and early detection and diagnosis of pulmonary nodules are essential for improving patient survival. Although previous work has integrated multimodal and multi-temporal information and outperformed single-modality, single-time-point approaches, existing fusion methods are limited to inefficient vector concatenation and simple mutual attention, underscoring the need for more effective multimodal information fusion. To address these challenges, we introduce a Dual-Graph Spatiotemporal Attention Network (DGSAN), which leverages temporal variations and multimodal data to enhance prediction accuracy. Our method comprises a Global-Local Feature Encoder that better captures the local, global, and fused characteristics of pulmonary nodules; a Dual-Graph Construction method that organizes multimodal features into inter-modal and intra-modal graphs; and a Hierarchical Cross-Modal Graph Fusion Module that refines feature integration. We also compile a new multimodal dataset, NLST-cmst, as a comprehensive resource for related research. Extensive experiments on both the NLST-cmst and curated CSTL-derived datasets demonstrate that DGSAN significantly outperforms state-of-the-art methods in pulmonary nodule classification with exceptional computational efficiency.
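To make the dual-graph idea concrete, the following is a minimal conceptual sketch, not the authors' implementation: per-modality ("intra-modal") graphs over nodule features from each time point, plus an "inter-modal" graph linking nodes across modalities, fused with a simple graph-attention step. All module names, dimensions, adjacency rules, and the two-class output are illustrative assumptions.

```python
import torch
import torch.nn as nn


class SimpleGraphAttention(nn.Module):
    """One attention-weighted message-passing step over a given adjacency mask."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x, adj):
        # x: (N, dim) node features; adj: (N, N) binary adjacency (1 = edge)
        scores = self.q(x) @ self.k(x).T / x.shape[-1] ** 0.5
        scores = scores.masked_fill(adj == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        return x + attn @ self.v(x)  # residual update


class DualGraphFusion(nn.Module):
    """Fuse per-modality feature sets through intra- and inter-modal graphs (assumed layout)."""
    def __init__(self, dim):
        super().__init__()
        self.intra = SimpleGraphAttention(dim)
        self.inter = SimpleGraphAttention(dim)
        self.classifier = nn.Linear(dim, 2)  # benign vs. malignant (assumed)

    def forward(self, modality_feats):
        # modality_feats: list of (T_m, dim) tensors, one per modality
        sizes = [f.shape[0] for f in modality_feats]
        x = torch.cat(modality_feats, dim=0)  # (N, dim), N = sum of T_m
        n = x.shape[0]

        # Intra-modal graph: fully connect nodes that share a modality.
        intra_adj = torch.zeros(n, n)
        start = 0
        for s in sizes:
            intra_adj[start:start + s, start:start + s] = 1
            start += s

        # Inter-modal graph: connect nodes from different modalities, keep self-loops.
        inter_adj = 1 - intra_adj
        inter_adj.fill_diagonal_(1)

        x = self.intra(x, intra_adj)
        x = self.inter(x, inter_adj)
        return self.classifier(x.mean(dim=0))  # graph-level prediction


# Toy usage: two modalities (e.g. imaging features and clinical-metadata embeddings),
# three time points each, 64-dimensional features.
feats = [torch.randn(3, 64), torch.randn(3, 64)]
logits = DualGraphFusion(64)(feats)
print(logits.shape)  # torch.Size([2])
```

The hierarchical cross-modal fusion in the paper is richer than this two-step update; the sketch only illustrates how separating intra-modal and inter-modal adjacency lets each graph specialize before their messages are combined.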