Deep Learning (DL), and specifically CNN models, have become the de facto method for a wide range of vision tasks, outperforming traditional machine learning (ML) methods. Consequently, they have drawn a lot of attention in the neuroimaging field, in particular for phenotype prediction and computer-aided diagnosis. However, most current studies deal with small single-site cohorts, a specific pre-processing pipeline, and custom CNN architectures, which makes them difficult to compare. We propose an extensive benchmark of recent state-of-the-art (SOTA) 3D CNNs, also evaluating the benefits of data augmentation and deep ensemble learning, on both Voxel-Based Morphometry (VBM) pre-processed and quasi-raw images. Experiments were conducted on a large multi-site 3D brain anatomical MRI dataset comprising N=10k scans, on 3 challenging tasks: age prediction, sex classification, and schizophrenia diagnosis. We found that all models provide significantly better predictions with VBM images than with quasi-raw data. This gap narrows as the training set approaches 10k samples, where quasi-raw data almost reach the performance of VBM. Moreover, we showed that linear models perform comparably with SOTA CNNs on VBM data. We also demonstrated that DenseNet and tiny-DenseNet, a lighter version that we propose, provide a good compromise in terms of performance in all data regimes. We therefore suggest employing them as the default architectures. Critically, we also showed that current CNNs are still strongly biased towards the acquisition site, even when trained with N=10k multi-site images. In this context, VBM pre-processing provides an efficient way to limit this site effect. Surprisingly, we did not find any clear benefit from data augmentation techniques. Finally, we found that deep ensemble learning is well suited to re-calibrating big CNN models without sacrificing performance.
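To make the last claim concrete, the following is a minimal sketch of how a deep ensemble for a binary task (such as the diagnosis task) might be combined and its calibration measured. It is not the authors' implementation: the function names `ensemble_probs` and `expected_calibration_error` are hypothetical, averaging member probabilities is only one common way to combine an ensemble, and the expected calibration error (ECE) with 10 equal-width bins is an assumed metric that the abstract does not name.

```python
import numpy as np


def ensemble_probs(member_probs):
    """Average P(y=1) over ensemble members.

    member_probs: array-like of shape (n_members, n_samples).
    Averaging probabilities is an assumed combination rule.
    """
    return np.mean(np.asarray(member_probs, dtype=float), axis=0)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: bin-weighted mean of |accuracy - confidence|.

    Confidence is the probability assigned to the predicted class,
    grouped into n_bins equal-width bins over [0, 1].
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)

    preds = (probs >= 0.5).astype(int)
    # Confidence in the predicted class, always in [0.5, 1].
    conf = np.where(preds == 1, probs, 1.0 - probs)
    correct = (preds == labels).astype(float)

    # Assign each sample to a bin using the interior bin edges.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(conf, edges[1:-1]), 0, n_bins - 1)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # mask.mean() is the fraction of samples falling in this bin.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)
```

Under these assumptions, one would compute the ECE of each single model's predictions and compare it with the ECE of `ensemble_probs` applied to all members; a lower value for the ensemble at similar accuracy would correspond to the re-calibration effect described above.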