Neural networks pre-trained with a self-supervision scheme have become the standard when operating in data-rich environments with scarce annotations. As such, fine-tuning a model to a downstream task in a parameter-efficient yet effective way, e.g. for a new set of classes in the case of semantic segmentation, is of increasing importance. In this work, we propose and investigate several contributions to achieve a parameter-efficient yet effective adaptation for semantic segmentation on two medical imaging datasets. Building on the recently popularized prompt tuning approach, we provide a prompt-able UNet (PUNet) architecture that is frozen after pre-training but remains adaptable throughout the network via class-dependent learnable prompt tokens. We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online-generated prototypes (contrastive prototype assignment, CPA) of a student-teacher combination, alongside a concurrent segmentation loss on a subset of classes. We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models on CT imaging datasets. The difference between fully fine-tuned and prompt-tuned variants amounts to only 3.83 pp for the TCIA/BTCV dataset and 2.67 pp for the CT-ORG dataset in mean Dice Similarity Coefficient (DSC, in %), while only the prompt tokens, corresponding to 0.85% of the pre-trained backbone model with 6.8M frozen parameters, are adjusted. The code for this work is available at https://github.com/marcdcfischer/PUNet.
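To make the parameter-efficient adaptation idea concrete, the following is a minimal sketch (not the authors' implementation) of prompt tuning a frozen segmentation backbone with class-dependent learnable prompt tokens. The module and function names, token shapes, and optimizer settings are illustrative assumptions; the actual PUNet code is available at the repository linked above.

```python
# Minimal sketch of prompt tuning: the pre-trained backbone is frozen and only
# class-dependent prompt tokens receive gradient updates. Names and shapes here
# are assumptions for illustration, not the authors' implementation.
import torch
import torch.nn as nn


class PromptTokens(nn.Module):
    """Learnable prompt tokens, one set per target class (assumed design)."""

    def __init__(self, num_classes: int, tokens_per_class: int, embed_dim: int):
        super().__init__()
        self.tokens = nn.Parameter(
            torch.randn(num_classes, tokens_per_class, embed_dim) * 0.02
        )

    def forward(self) -> torch.Tensor:
        # Flatten to a single token sequence that can be injected into the network.
        return self.tokens.flatten(0, 1)


def prepare_for_prompt_tuning(backbone: nn.Module, prompts: PromptTokens):
    """Freeze the pre-trained backbone; only the prompt tokens stay trainable."""
    for p in backbone.parameters():
        p.requires_grad_(False)
    trainable = [p for p in prompts.parameters() if p.requires_grad]
    # Only the (small) prompt parameter set is handed to the optimizer.
    return torch.optim.AdamW(trainable, lr=1e-3)
```

In this setup, the optimizer only ever sees the prompt parameters, which is how the adapted parameter count stays at a small fraction (here, reportedly 0.85%) of the frozen backbone.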