Abdominal organ segmentation has many important clinical applications, such as organ quantification, surgical planning, and disease diagnosis. However, manually annotating organs in CT scans is time-consuming and labor-intensive. Semi-supervised learning can alleviate this burden by learning from a large set of unlabeled images together with a limited number of labeled samples. In this work, we follow the self-training strategy and employ a high-performance hybrid architecture (PHTrans), consisting of CNN and Swin Transformer components, as the teacher model to generate precise pseudo labels for the unlabeled data. We then train a two-stage segmentation framework built on a lightweight PHTrans using both the pseudo-labeled and the labeled data, which improves the performance and generalization ability of the model while keeping it efficient. Experiments on the FLARE2022 validation set demonstrate that our method achieves excellent segmentation performance together with fast, low-resource inference. The average DSC and NSD are 0.8956 and 0.9316, respectively. In our development environment, the average inference time is 18.62 s, the average maximum GPU memory is 1995.04 MB, and the area under the GPU memory-time curve and the average area under the CPU utilization-time curve are 23196.84 and 319.67, respectively. The code is available at https://github.com/lseventeen/FLARE22-TwoStagePHTrans.
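For readers unfamiliar with the reported DSC (Dice similarity coefficient), the following is a minimal sketch of how a per-organ DSC and its average could be computed, using the standard definition DSC = 2|P ∩ G| / (|P| + |G|). It is an illustration only and not the official FLARE2022 evaluation code: the function names `dice_per_label` and `mean_dice`, the use of flattened label lists, and the convention of scoring an organ absent from both maps as 1.0 are all assumptions made here.

```python
from typing import Dict, Sequence


def dice_per_label(
    pred: Sequence[int], truth: Sequence[int], labels: Sequence[int]
) -> Dict[int, float]:
    """Return the Dice similarity coefficient for each label.

    `pred` and `truth` are flattened label maps (one integer per voxel).
    """
    if len(pred) != len(truth):
        raise ValueError("prediction and ground truth must have the same size")

    scores: Dict[int, float] = {}
    for label in labels:
        pred_count = sum(1 for v in pred if v == label)
        truth_count = sum(1 for v in truth if v == label)
        overlap = sum(
            1 for p, g in zip(pred, truth) if p == label and g == label
        )
        denom = pred_count + truth_count
        # Assumption: a label absent from both maps counts as a perfect match.
        scores[label] = 1.0 if denom == 0 else 2.0 * overlap / denom
    return scores


def mean_dice(
    pred: Sequence[int], truth: Sequence[int], labels: Sequence[int]
) -> float:
    """Average the per-label Dice scores over the given labels."""
    scores = dice_per_label(pred, truth, labels)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Toy example: 0 is background, 1 and 2 are two hypothetical organs.
    prediction = [0, 1, 1, 2, 2, 2]
    reference = [0, 1, 2, 2, 2, 0]
    print(dice_per_label(prediction, reference, labels=[1, 2]))
    print(mean_dice(prediction, reference, labels=[1, 2]))
```

Background (label 0) is excluded from the average in the toy example, which is a common choice for organ segmentation; whether a given benchmark does the same should be checked against its own evaluation protocol.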