The transformer has long dominated natural language processing (NLP). More recently, transformer-based methods have been adopted in computer vision (CV), where they show promising results. Medical image analysis, an important branch of CV, has naturally joined this trend. In this review, we explain the principle of the attention mechanism and the detailed structure of the transformer, and describe how the transformer is adopted in medical image analysis. We organize transformer-based medical image analysis applications by task: classification, segmentation, synthesis, registration, localization, detection, captioning, and denoising. For the mainstream classification and segmentation tasks, we further divide the corresponding works by medical imaging modality. We also compile the datasets used in the related works. In total, the review covers thirteen modalities and more than twenty objects.