Recent developments in regularized Canonical Correlation Analysis (CCA) promise powerful methods for high-dimensional, multiview data analysis. However, justifying the structural assumptions behind many popular approaches remains a challenge, and features of realistic biological datasets pose practical difficulties that are seldom discussed. We propose a novel CCA estimator rooted in an assumption of conditional independencies and based on the Graphical Lasso. Our method has desirable theoretical guarantees and good empirical performance, demonstrated through extensive simulations and real-world biological datasets. Recognizing the difficulties of model selection in high dimensions and other practical challenges of applying CCA in real-world settings, we introduce a novel framework for evaluating and interpreting regularized CCA models in the context of Exploratory Data Analysis (EDA), which we hope will empower researchers and pave the way for wider adoption.
翻译:暂无翻译