Integrative statistical analysis of multi-view, high-dimensional data, acquired from different experiments on each subject in a joint study, can be challenging. Canonical Correlation Analysis (CCA) is a statistical procedure for identifying relationships between such data sets. In that context, Structured Sparse CCA (ScSCCA) is a rapidly emerging methodological area that aims for robust modeling of the interrelations between the different data modalities by assuming the corresponding CCA directional vectors to be sparse. Although the area is growing quickly, there remains a need for corresponding methodology in the Bayesian paradigm. In this manuscript, we propose a novel ScSCCA approach that employs a Bayesian infinite factor model and aims for robust estimation by encouraging sparsity at two levels of the modeling framework. First, we use a multiplicative Half-Cauchy process prior to encourage sparsity in the latent variable loading matrices. Second, we promote further sparsity in the covariance matrix by using either a graphical horseshoe prior or a diagonal structure. We conduct multiple simulations to compare the performance of the proposed method with that of other frequently used CCA procedures, and we apply the developed procedures to multi-omics data arising from a breast cancer study.
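For orientation, the classical CCA problem that the sparse and Bayesian variants build on can be sketched in a few lines. This is a minimal illustration of ordinary (non-sparse, non-Bayesian) CCA, not the method proposed in the manuscript; the function name `classical_cca` and the ridge term `reg` are assumptions introduced here for illustration only.

```python
import numpy as np

def classical_cca(X, Y, reg=1e-8):
    """Return the first pair of canonical directions (a, b) and the
    first canonical correlation for data X (n x p) and Y (n x q).

    `reg` is a small ridge term (an assumption of this sketch) that keeps
    the covariance matrices positive definite for the Cholesky step.
    """
    # Center both views.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariance and cross-covariance matrices.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Whiten with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # K = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    U, s, Vt = np.linalg.svd(K, full_matrices=False)

    # Map the leading singular vectors back to the original coordinates.
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0])
    return a, b, s[0]
```

Calling `classical_cca(X, Y)` on two matrices with the same number of rows returns directional vectors `a` and `b` whose projections `X @ a` and `Y @ b` are maximally correlated. In high-dimensional settings these vectors are dense and unstable, which is the motivation for the sparsity assumptions discussed above.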