Convolutional Neural Networks with Intermediate Loss for 3D Super-Resolution of CT and MRI Scans

CT scanners commonly used in hospitals today produce low-resolution images of up to 512 pixels in size. One pixel in the image corresponds to a one-millimeter piece of tissue. In order to accurately segment tumors and make treatment plans, doctors need CT scans of higher resolution. The same problem appears in MRI. In this paper, we propose an approach for single-image super-resolution of 3D CT or MRI scans. Our method is based on deep convolutional neural networks (CNNs) composed of 10 convolutional layers and an intermediate upscaling layer that is placed after the first 6 convolutional layers. Our first CNN, which increases the resolution on two axes (width and height), is followed by a second CNN, which increases the resolution on the third axis (depth). Unlike other methods, we compute the loss with respect to the ground-truth high-resolution output right after the upscaling layer, in addition to computing the loss after the last convolutional layer. The intermediate loss forces our network to produce a better output, closer to the ground truth. A widely used approach to obtain sharp results is to add Gaussian blur using a fixed standard deviation. In order to avoid overfitting to a fixed standard deviation, we instead apply Gaussian smoothing with various standard deviations. We evaluate our method in the context of 2D and 3D super-resolution of CT and MRI scans from two databases, comparing it to related works from the literature and to baselines based on various interpolation schemes, using 2x and 4x scaling factors. The empirical results show that our approach outperforms all other methods. Moreover, our human annotation study reveals that both doctors and regular annotators chose our method over Lanczos interpolation in 97.55% of cases for the 2x upscaling factor and in 96.69% of cases for the 4x upscaling factor.

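
The two training ideas in the abstract, the intermediate loss and Gaussian smoothing with varying standard deviations, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the loss type, the loss weighting, the range of standard deviations, where the blur is applied, or how low-resolution inputs are produced. The sketch therefore assumes a mean absolute error, equal weighting of the two loss terms, a sigma drawn uniformly from a hypothetical range, blur applied to the high-resolution image before downsampling, and downsampling by simple striding. All function and parameter names are hypothetical.

```python
import numpy as np


def gaussian_kernel_1d(sigma):
    """Return a normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(np.ceil(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur_2d(image, sigma):
    """Separable Gaussian blur of a 2D array with edge-replicated padding."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="edge")
    # Filter every row, then every column; "valid" mode removes the padding.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def make_training_pair(hr_image, scale, rng, sigma_range=(0.5, 1.5)):
    """Build a (low-res, high-res) pair with a randomly chosen blur strength.

    Assumptions: sigma_range is a hypothetical choice, and striding stands in
    for whatever downscaling procedure is actually used.
    """
    sigma = rng.uniform(sigma_range[0], sigma_range[1])
    blurred = gaussian_blur_2d(hr_image, sigma)
    lr_image = blurred[::scale, ::scale]
    return lr_image, hr_image


def mean_absolute_error(prediction, target):
    """Mean absolute difference between two arrays of equal shape."""
    return float(np.mean(np.abs(prediction - target)))


def combined_loss(intermediate_output, final_output, target,
                  intermediate_weight=1.0):
    """Sum of the loss after the upscaling layer and after the last layer.

    Both outputs are already at the target resolution, so both can be
    compared against the same ground-truth image. Equal weighting is an
    assumption of this sketch.
    """
    intermediate_term = mean_absolute_error(intermediate_output, target)
    final_term = mean_absolute_error(final_output, target)
    return intermediate_weight * intermediate_term + final_term
```

In a training loop, `make_training_pair` would be called with a fresh random sigma for each sample, so the network never sees a single fixed blur level, and `combined_loss` would be evaluated on the output of the upscaling layer and on the output of the last convolutional layer.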