Computer vision techniques such as convolutional neural networks (CNNs) are frequently used for medical image recognition, and 3D CNN-based models have recently come to dominate magnetic resonance imaging (MRI) analysis. Because MRI data closely resemble videos, we conduct extensive empirical studies on video recognition techniques for MRI classification to answer three questions: (1) can we directly use video recognition models for MRI classification, (2) which model is most appropriate for MRI, and (3) are common video recognition tricks such as data augmentation still useful for MRI classification? Our work suggests that advanced video techniques benefit MRI classification. Our experiments use four datasets for Alzheimer's and Parkinson's disease recognition, three alternative video recognition models, and data augmentation techniques that are frequently applied to video tasks. The results show that the video framework outperforms 3D CNN models by 5% to 11% while using 50% to 66% fewer trainable parameters. This report advances the potential fusion of 3D medical imaging and video understanding research.