Joint analysis of gene expression data and genetic variants is of great interest, especially when the numbers of gene expressions and genetic variants both exceed the sample size. The need to identify both causal genes and effective SNPs makes sparse modeling of such genetic data sets highly important. High-dimensional sparse instrumental variables models are one such class of association models; they jointly model the relationship of gene expressions and genetic variants with complex traits. From a Bayesian viewpoint, sparsity can be favored using sparsity-enforcing priors such as spike-and-slab priors. A two-stage modification of the expectation propagation (EP) algorithm is proposed and examined for approximate inference in high-dimensional sparse instrumental variables models with spike-and-slab priors. This method is an adaptation of the classical two-stage least squares method to the Bayesian context. A simulation study is performed to examine the performance of the methods. The proposed method is applied to the analysis of mouse obesity data.
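For readers unfamiliar with the classical two-stage least squares method that the proposed approach adapts, the following is a minimal sketch in a low-dimensional setting. It is not the two-stage EP algorithm of this work and uses no spike-and-slab priors. The function names, the simulated design, and all numeric values are assumptions made for illustration only. Here Z plays the role of genetic variants (instruments), X the gene expressions, and y the trait.

```python
import numpy as np

def two_stage_least_squares(Z, X, y):
    """Classical 2SLS. Z: instruments (n x q), X: exposures (n x p), y: outcome (n,)."""
    # Stage 1: regress every column of X on Z and keep the fitted values.
    gamma_hat, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ gamma_hat
    # Stage 2: regress y on the fitted exposures instead of the observed ones.
    beta_hat, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta_hat, gamma_hat

def simulate(n=5000, seed=0):
    """Hypothetical toy data: 6 SNP-like instruments, 2 exposures, 1 confounder."""
    rng = np.random.default_rng(seed)
    Z = rng.binomial(2, 0.3, size=(n, 6)).astype(float)  # genotype counts 0/1/2
    Z -= Z.mean(axis=0)                                   # center, so no intercept is needed
    Gamma = np.tile(np.eye(2), (3, 1))                    # each exposure driven by 3 instruments
    u = rng.normal(size=n)                                # unobserved confounder
    X = Z @ Gamma + u[:, None] + rng.normal(size=(n, 2))
    beta = np.array([1.0, -0.5])                          # assumed true causal effects
    y = X @ beta + 2.0 * u + rng.normal(size=n)
    return Z, X, y, beta

if __name__ == "__main__":
    Z, X, y, beta = simulate()
    beta_2sls, _ = two_stage_least_squares(Z, X, y)
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    print("true:", beta)
    print("2SLS:", beta_2sls)
    print("OLS :", beta_ols)
```

In this toy design the confounder biases the ordinary least squares estimate, while the second-stage regression on the instrument-predicted exposures does not suffer from that bias. When the numbers of variants and expressions exceed the sample size, as in the setting considered here, both least squares stages become ill-posed, which is where sparsity-enforcing priors enter.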