Segmentation of liver structures in multi-phase contrast-enhanced computed tomography (CECT) plays a crucial role in computer-aided diagnosis and treatment planning for liver diseases, including tumor detection. In this study, we investigate the performance of UNet-based architectures for liver tumor segmentation, starting from the original UNet and extending to UNet3+ with various backbone networks. We evaluate ResNet, Transformer-based, and state-space (Mamba) backbones, all initialized with pretrained weights. Surprisingly, despite advances in modern architectures, ResNet-based models consistently outperform Transformer- and Mamba-based alternatives across multiple evaluation metrics. To further improve segmentation quality, we introduce attention mechanisms into the backbone and observe that incorporating the Convolutional Block Attention Module (CBAM) yields the best performance. ResNetUNet3+ with a CBAM module not only produces the best overlap metrics, with a Dice score of 0.755 and an IoU of 0.662, but also achieves the most precise boundary delineation, as evidenced by the lowest HD95 distance of 77.911. It also attains the highest overall accuracy (0.925) and specificity (0.926), indicating a robust ability to correctly identify both lesion and healthy tissue. To enhance interpretability, Grad-CAM visualizations are used to highlight the regions most influential to the model's predictions, providing insight into its decision-making process. These findings demonstrate that classical ResNet architectures, when combined with modern attention modules, remain highly competitive for medical image segmentation tasks, offering a promising direction for liver tumor detection in clinical practice.
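For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Dice, IoU, accuracy, and specificity are commonly computed from a binary prediction and a binary ground-truth mask. The abstract does not state the exact definitions or the averaging scheme used in the study, so the formulas below are the standard confusion-matrix forms, and the function names `confusion_counts` and `segmentation_metrics` are hypothetical. HD95 is omitted because it requires surface-distance computations beyond this short example.

```python
# Illustrative sketch only: standard confusion-matrix definitions,
# not necessarily the exact evaluation protocol used in the study.

def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equally sized flat sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1      # lesion voxel correctly predicted
        elif p and not t:
            fp += 1      # healthy voxel predicted as lesion
        elif not p and t:
            fn += 1      # lesion voxel missed
        else:
            tn += 1      # healthy voxel correctly predicted
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Return Dice, IoU, accuracy, and specificity for one pair of masks."""
    tp, fp, fn, tn = confusion_counts(pred, truth)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "specificity": ratio(tn, tn + fp),
    }


# Toy flattened masks (1 = lesion, 0 = background), invented for illustration.
pred = [1, 1, 1, 0, 0, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = segmentation_metrics(pred, truth)
print(metrics)
```

For a single pair of masks, Dice and IoU are tied by Dice = 2·IoU / (1 + IoU). That identity does not generally hold for scores averaged over many cases, so mean Dice and mean IoU reported for a test set need not satisfy it.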