Prostate cancer biopsy benefits from accurate fusion of transrectal ultrasound (TRUS) and magnetic resonance (MR) images. In recent years, convolutional neural networks (CNNs) have proven powerful at extracting image features crucial for image registration. However, challenging applications and recent advances in computer vision suggest that CNNs are limited in their ability to understand spatial correspondence between features, a task at which the self-attention mechanism excels. This paper aims to develop a self-attention mechanism specifically for cross-modal image registration. Our proposed cross-modal attention block effectively maps each feature in one volume to all features in the corresponding volume. Our experimental results demonstrate that a CNN network with the cross-modal attention block embedded outperforms an advanced CNN network 10 times its size. We also incorporated visualization techniques to improve the interpretability of our network. The source code of our work is available at https://github.com/DIAL-RPI/Attention-Reg.
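The core idea of the cross-modal attention block, mapping each feature in one volume to all features in the other, can be sketched as scaled dot-product attention between the two modalities' feature sets. This is a minimal illustrative sketch, not the paper's implementation: the function name, 2D feature matrices, and shapes are assumptions for clarity (the actual network operates on 3D feature volumes).

```python
import numpy as np

def cross_modal_attention(feat_a, feat_b):
    """Attend each feature in feat_a to all features in feat_b.

    feat_a: (N, d) features from one modality (e.g. MR) -- hypothetical shapes
    feat_b: (M, d) features from the other modality (e.g. TRUS)
    Returns: (N, d) features of feat_a re-expressed in terms of feat_b.
    """
    d = feat_a.shape[1]
    # Similarity between every feature pair across the two volumes
    scores = feat_a @ feat_b.T / np.sqrt(d)          # (N, M)
    # Softmax over the feat_b axis (with max-subtraction for stability)
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)    # rows sum to 1
    # Each output row is a correspondence-weighted mix of feat_b features
    return weights @ feat_b
```

Because every output feature aggregates information from the entire opposing volume, the block exposes global cross-modal correspondence that a purely convolutional receptive field cannot.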