This paper presents and compares alternative transfer learning methods that can increase the power of conditional testing via knockoffs by leveraging prior information from external data sets collected from different populations or measuring related outcomes. The relevance of this methodology is explored in particular within the context of genome-wide association studies, where it can help address the pressing need for principled ways to suitably account for, and efficiently learn from, the genetic variation associated with diverse ancestries. Finally, we apply these methods to analyze several phenotypes in the UK Biobank data set, demonstrating that transfer learning helps knockoffs discover more associations in the data collected from minority populations, potentially opening the way to the development of more accurate polygenic risk scores.