Estimating the sharing of genetic effects across different conditions is important to many statistical analyses of genomic data. The patterns of sharing arising from these data are often highly heterogeneous. To flexibly model these heterogeneous sharing patterns, Urbut et al. (2019) proposed the multivariate adaptive shrinkage (MASH) method to jointly analyze genetic effects across multiple conditions. However, multivariate analyses using MASH (as well as other multivariate analyses) require good estimates of the sharing patterns, and estimating these patterns efficiently and accurately remains challenging. Here we describe new empirical Bayes methods that provide improvements in speed and accuracy over existing methods. The two key ideas are: (1) adaptive regularization to improve accuracy in settings with many conditions; (2) improving the speed of the model fitting algorithms by exploiting analytical results on covariance estimation. In simulations, we show that the new methods provide better model fits, better out-of-sample performance, and improved power and accuracy in detecting the true underlying signals. In an analysis of eQTLs in 49 human tissues, our new analysis pipeline achieves better model fits and better out-of-sample performance than the existing MASH analysis pipeline. We have implemented the new methods, which we call ``Ultimate Deconvolution'', in an R package, udr, available on GitHub.
翻译:暂无翻译