Genetic risk prediction is complex, and there is a critical need for novel methodologies that can capture intricate (e.g., nonlinear) genotype-phenotype relationships while remaining statistically interpretable and computationally tractable. We develop a Neural Tangent Kernel (NTK) framework that integrates kernel methods into deep neural networks for genetic risk prediction. We consider two approaches: NTK-LMM, which embeds the empirical NTK in a linear mixed model with variance components estimated via minimum norm quadratic unbiased estimation (MINQUE), and NTK-KRR, which performs kernel ridge regression with cross-validated regularization. Through simulation studies, we show that NTK-based models outperform traditional neural network models and linear mixed models. Applying NTK to endophenotypes (e.g., hippocampal volume) and Alzheimer's disease (AD)-related genes (e.g., APOE) from the Alzheimer's Disease Neuroimaging Initiative (ADNI), we find that NTK achieves higher accuracy than existing methods for hippocampal volume and entorhinal cortex thickness. Beyond accuracy, NTK has favorable optimization properties (i.e., closed-form or convex training) and generates interpretable results through its connection to variance components and heritability. Overall, our results indicate that by integrating the strengths of deep neural networks and kernel methods, NTK offers competitive performance for genetic risk prediction while retaining the advantages of interpretability and computational efficiency.
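As a rough illustration of the NTK-KRR idea, the sketch below computes the empirical NTK of a toy one-hidden-layer tanh network at random initialization and plugs it into closed-form kernel ridge regression. Everything here is an assumption made for illustration and not the architecture or pipeline used in the paper: the network form, the simulated genotype dosages, the function names `empirical_ntk` and `krr_predict`, and the single hold-out split that stands in for full cross-validation of the regularization parameter.

```python
import numpy as np

def empirical_ntk(X1, X2, W, a):
    """Empirical NTK of the toy network f(x) = a @ tanh(W @ x) / sqrt(m),
    evaluated at the given parameters (W, a). Hypothetical helper."""
    m = W.shape[0]
    H1, H2 = np.tanh(X1 @ W.T), np.tanh(X2 @ W.T)   # hidden activations, (n, m)
    D1, D2 = 1.0 - H1**2, 1.0 - H2**2               # tanh derivatives
    k_a = H1 @ H2.T                                  # inner product of gradients w.r.t. a
    k_W = ((D1 * a**2) @ D2.T) * (X1 @ X2.T)         # inner product of gradients w.r.t. W
    return (k_a + k_W) / m

def krr_predict(K_train, y_train, K_new_train, lam):
    """Closed-form kernel ridge regression: solve (K + lam * I) alpha = y."""
    n = K_train.shape[0]
    alpha = np.linalg.solve(K_train + lam * np.eye(n), y_train)
    return K_new_train @ alpha

# Simulated data (assumption): n individuals, p variants coded 0/1/2.
rng = np.random.default_rng(0)
n, p, m = 60, 10, 64
X = rng.integers(0, 3, size=(n, p)).astype(float)
X = (X - X.mean(axis=0)) / np.sqrt(p)                # center and scale
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(n)  # nonlinear toy phenotype

# Random network parameters at initialization.
W = rng.standard_normal((m, p))
a = rng.standard_normal(m)

# Single hold-out split, a simplification of cross-validated regularization.
train, valid = np.arange(40), np.arange(40, 60)
K_tt = empirical_ntk(X[train], X[train], W, a)
K_vt = empirical_ntk(X[valid], X[train], W, a)

lams = [1e-3, 1e-2, 1e-1, 1.0, 10.0]
val_mse = [np.mean((krr_predict(K_tt, y[train], K_vt, lam) - y[valid]) ** 2)
           for lam in lams]
best_lam = lams[int(np.argmin(val_mse))]
```

Because the fit reduces to a single linear solve, training is closed-form once the kernel matrix is built; the same kernel matrix could in principle serve as the covariance structure of a random effect in a linear mixed model, which is the role the abstract assigns to NTK-LMM.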