Rare variants are hypothesized to be largely responsible for heritability and susceptibility to disease in humans, so rare-variant association studies hold promise for understanding disease. The rareness of these variants, however, poses practical challenges: because each variant is present in few individuals, it can be difficult to develop data-collection and statistical methods that effectively leverage their sparse information. In this work, we develop a novel Bayesian nonparametric model to capture how design choices in rare-variant association studies can impact their usefulness. We then show how to use our model to guide design choices under a fixed experimental budget in practice. In particular, we provide a practical workflow and illustrative experiments on simulated data.