Motivation: Despite the great success of genome-wide association studies (GWAS), multiple challenges remain. First, complex traits are often associated with many single nucleotide polymorphisms (SNPs), each with a small or moderate effect size. Second, our understanding of the functional mechanisms through which genetic variants are associated with complex traits is still limited. To address these challenges, we propose GPA-Tree, which simultaneously performs association mapping and identifies key combinations of functional annotations related to risk-associated SNPs by combining a decision tree algorithm with a hierarchical modeling framework.

Results: First, we conducted simulation studies to evaluate the proposed GPA-Tree method and compared its performance with existing statistical approaches. The results indicate that GPA-Tree outperforms existing statistical approaches in detecting risk-associated SNPs and identifies the true combinations of functional annotations with high accuracy. Second, we applied GPA-Tree to a systemic lupus erythematosus (SLE) GWAS together with functional annotation data, including GenoSkyline and GenoSkylinePlus. The results from GPA-Tree highlight the dysregulation of blood immune cells in SLE, including but not limited to primary B cells, memory helper T cells, regulatory T cells, neutrophils, and CD8+ memory T cells. These results demonstrate that GPA-Tree can be a powerful tool that improves association mapping while facilitating understanding of the underlying genetic architecture of complex traits and the potential mechanisms linking risk-associated SNPs with complex traits.