The deviation between chronological age and age predicted from neuroimaging data has been identified as a sensitive risk marker of cross-disorder brain changes, growing into a cornerstone of biological age research. However, the machine learning models underlying the field do not consider uncertainty, thereby confounding results with training data density and variability. In addition, existing models are commonly based on homogeneous training sets, are often not independently validated, and cannot be shared due to data protection issues. Here, we introduce an uncertainty-aware, shareable, and transparent Monte-Carlo Dropout Composite-Quantile-Regression (MCCQR) Neural Network trained on N=10,691 datasets from the German National Cohort. The MCCQR model provides robust, distribution-free uncertainty quantification in high-dimensional neuroimaging data, achieving lower error rates than existing models across ten recruitment centers and in three independent validation samples (N=4,004). In two examples, we demonstrate that it prevents spurious associations and increases power to detect accelerated brain aging. We make the pre-trained model publicly available.
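To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch of a composite quantile (pinball) loss and of averaging quantile predictions over several stochastic forward passes, as Monte Carlo dropout would produce. It is not the authors' implementation: the function names, the choice of quantile levels, and the aggregation by mean and standard deviation are assumptions made for illustration, since the abstract does not specify how MCCQR combines the two.

```python
import numpy as np


def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss for one quantile level tau in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


def composite_quantile_loss(y_true, y_pred_quantiles, taus):
    """Average pinball loss over several quantile levels.

    y_pred_quantiles has shape (n_samples, n_quantiles), with one column
    per entry of taus.
    """
    preds = np.asarray(y_pred_quantiles, dtype=float)
    losses = [pinball_loss(y_true, preds[:, k], tau) for k, tau in enumerate(taus)]
    return float(np.mean(losses))


def aggregate_mc_passes(mc_quantiles):
    """Summarise quantile predictions from repeated stochastic passes.

    mc_quantiles has shape (n_passes, n_samples, n_quantiles). Returns the
    mean and standard deviation across passes (an assumed aggregation).
    """
    mc = np.asarray(mc_quantiles, dtype=float)
    return mc.mean(axis=0), mc.std(axis=0)


# Toy usage with made-up numbers: one subject aged 10, three quantile levels.
taus = [0.1, 0.5, 0.9]
loss = composite_quantile_loss([10.0], [[8.0, 10.0, 12.0]], taus)
mean_q, std_q = aggregate_mc_passes([[[1.0, 2.0]], [[3.0, 4.0]]])
```

In this sketch, the spread between the lower and upper predicted quantiles would serve as a distribution-free interval for the predicted age, while the variation across passes reflects model uncertainty; how the published model weights or combines these is not stated in the abstract.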