ViT-V-Net: Vision Transformer for Unsupervised Volumetric Medical Image Registration

In the last decade, convolutional neural networks (ConvNets) have dominated a variety of medical imaging applications and achieved state-of-the-art performance in them. However, the performance of ConvNets is still limited by a lack of understanding of long-range spatial relations in an image. The recently proposed Vision Transformer (ViT) for image classification uses a purely self-attention-based model that learns long-range spatial relations to focus on the relevant parts of an image. Nevertheless, ViT emphasizes low-resolution features because of its consecutive downsampling steps, resulting in a lack of detailed localization information and making it unsuitable for image registration. Recently, several ViT-based image segmentation methods have been combined with ConvNets to improve the recovery of detailed localization information. Inspired by them, we present ViT-V-Net, which bridges ViT and ConvNet to provide volumetric medical image registration. The experimental results presented here demonstrate that the proposed architecture achieves superior performance to several top-performing registration methods.


Image registration

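Image registration is a classic problem and a technical challenge in image processing research. Its goal is to compare or fuse images of the same object acquired under different conditions, for example from different acquisition devices, at different times, or from different viewpoints; sometimes registration between images of different objects is also needed. Concretely, given two images from a dataset, registration seeks a spatial transformation that maps one image onto the other so that points corresponding to the same spatial location in the two images are brought into one-to-one correspondence, which makes information fusion possible. The technique is widely used in computer vision, medical image processing, and materials mechanics. Depending on the application, some work focuses on fusing the two images through the transformation result, while other work studies the transformation itself to derive mechanical properties of the object.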
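To make the idea of "finding a spatial transformation that maps one image onto the other" concrete, here is a minimal sketch in plain Python. It is not the ViT-V-Net method and not a deformable or learned registration: it assumes the transformation is a pure integer translation, uses mean squared error as the similarity measure, and searches all candidate shifts exhaustively. The function names (`shift`, `mse`, `register_translation`, `make_square`) and the toy 8x8 images are hypothetical, chosen only for illustration.

```python
from itertools import product


def shift(image, dy, dx, fill=0.0):
    """Translate a 2D image (list of rows) by (dy, dx), padding with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx  # source pixel that lands on (y, x)
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = image[sy][sx]
    return out


def mse(a, b):
    """Mean squared intensity difference between two same-sized images."""
    h, w = len(a), len(a[0])
    return sum((a[y][x] - b[y][x]) ** 2 for y in range(h) for x in range(w)) / (h * w)


def register_translation(fixed, moving, max_shift=3):
    """Try every integer shift in [-max_shift, max_shift]^2 and return the
    (dy, dx) that best aligns `moving` to `fixed` under MSE."""
    candidates = product(range(-max_shift, max_shift + 1), repeat=2)
    return min(candidates, key=lambda t: mse(fixed, shift(moving, t[0], t[1])))


def make_square(top, left, size=8, side=3):
    """Toy image: a bright square on a dark background (illustrative data)."""
    img = [[0.0] * size for _ in range(size)]
    for y in range(top, top + side):
        for x in range(left, left + side):
            img[y][x] = 1.0
    return img


fixed = make_square(3, 4)    # reference image
moving = make_square(1, 2)   # same object, displaced
best = register_translation(fixed, moving)
print("estimated (dy, dx):", best)
```

Practical registration methods replace each piece of this sketch: the translation becomes an affine map or a dense displacement field, the similarity measure is chosen to suit the imaging modality, and the exhaustive search gives way to iterative optimization or a trained network.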
