Canonical Correlation Analysis (CCA) and its regularised versions have been widely used in the neuroimaging community to uncover multivariate associations between two data modalities (e.g., brain imaging and behaviour). However, these methods have inherent limitations: (1) statistical inferences about the associations are often not robust; (2) the associations within each data modality are not modelled; (3) missing values need to be imputed or removed. Group Factor Analysis (GFA) is a hierarchical model that addresses the first two limitations by providing Bayesian inference and modelling modality-specific associations. Here, we propose an extension of GFA that handles missing data, and highlight that GFA can be used as a predictive model. We applied GFA to synthetic data and to real data consisting of brain connectivity and non-imaging measures from the Human Connectome Project (HCP). In synthetic data, GFA uncovered the underlying shared and specific factors and correctly predicted the non-observed data modalities in complete and incomplete data sets. In the HCP data, we identified four relevant shared factors, capturing associations between mood, alcohol and drug use, cognition, demographics and psychopathological measures and the default mode, frontoparietal control, dorsal and ventral networks and insula, as well as two factors describing associations within brain connectivity. In addition, GFA predicted a set of non-imaging measures from brain connectivity. These findings were consistent in complete and incomplete data sets, and replicated previous findings in the literature. GFA is a promising tool that can be used to uncover associations between and within multiple data modalities in benchmark datasets (such as the HCP), and can be easily extended to more complex models to solve more challenging tasks.
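To make the notions of "shared" and "specific" factors concrete, the following minimal sketch simulates two data modalities from the linear-Gaussian form commonly used to describe GFA, x^(m) = W^(m) z + noise, where a factor is shared if its loadings are non-zero in both modalities and specific if they are non-zero in only one. It illustrates the generative assumption and the presence of missing entries only; it is not the inference procedure and not the authors' implementation. The function name `simulate_gfa`, the dimensions, the noise level and the missing-value rate are all hypothetical choices made for illustration.

```python
import random


def simulate_gfa(n_samples=200, dims=(5, 4), missing_rate=0.1,
                 noise_sd=0.1, seed=0):
    """Simulate two modalities from one shared and two specific factors.

    All sizes and rates are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    n_factors = 3
    # active[m][k] == 1 means factor k has non-zero loadings in modality m.
    # Factor 0 is shared; factor 1 is specific to modality 0;
    # factor 2 is specific to modality 1.
    active = [[1, 1, 0],
              [1, 0, 1]]
    # loadings[m][j][k]: weight of factor k on feature j of modality m.
    loadings = [
        [[rng.gauss(0, 1) * active[m][k] for k in range(n_factors)]
         for _ in range(d)]
        for m, d in enumerate(dims)
    ]
    # One latent vector z per sample, common to both modalities.
    latents = [[rng.gauss(0, 1) for _ in range(n_factors)]
               for _ in range(n_samples)]
    data = []
    for m, d in enumerate(dims):
        modality = []
        for z in latents:
            row = []
            for j in range(d):
                value = sum(w * zk for w, zk in zip(loadings[m][j], z))
                value += rng.gauss(0, noise_sd)
                # Entries are dropped at random and stored as None.
                row.append(None if rng.random() < missing_rate else value)
            modality.append(row)
        data.append(modality)
    return data, loadings, latents, active


if __name__ == "__main__":
    data, loadings, latents, active = simulate_gfa()
    n_missing = sum(v is None for row in data[0] for v in row)
    print("modality 0 shape:", len(data[0]), "x", len(data[0][0]))
    print("missing entries in modality 0:", n_missing)
```

In this sketch, a model fitted to such data would be expected to recover the block-sparse loading pattern encoded in `active`, and the shared factor is what would allow one modality to be predicted from the other.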