Data on high-dimensional spheres arise in many disciplines, either naturally or as a consequence of preliminary processing, and can have intricate dependence structure that needs to be understood. We develop exploratory factor analysis of the projected normal distribution to explain the variability in such data using a few easily interpreted latent factors. Our methodology provides maximum likelihood estimates through a novel, fast alternating expectation profile conditional maximization algorithm. Simulation experiments over a wide range of settings give uniformly excellent results. Our methodology provides interpretable and insightful results when applied to tweets with the $\#MeToo$ hashtag in early December 2018, to time-course functional Magnetic Resonance Images of the average pre-teen brain at rest, to the characterization of handwritten digits, and to gene expression data from cancerous cells in the Cancer Genome Atlas.
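To make the model class concrete, the following is a minimal sketch of how data on a sphere could be simulated from a projected normal distribution whose covariance has factor structure. It assumes the standard construction (a Gaussian vector divided by its Euclidean norm) and the usual factor decomposition of the covariance into loadings plus diagonal uniquenesses. The function name, the parameter names (`mu`, `Lambda`, `psi`), and the chosen dimensions are illustrative assumptions, not the authors' notation or code, and the sketch does not implement the estimation algorithm.

```python
import numpy as np


def sample_projected_normal_factor(n, mu, Lambda, psi, seed=0):
    """Draw n unit vectors y = x / ||x||, where x ~ N(mu, Lambda Lambda^T + diag(psi)).

    Hypothetical helper for illustration only.
    mu:     (p,) mean of the latent Gaussian
    Lambda: (p, q) factor loading matrix, with q much smaller than p
    psi:    (p,) positive uniquenesses (diagonal of the residual covariance)
    """
    rng = np.random.default_rng(seed)
    p, q = Lambda.shape
    z = rng.standard_normal((n, q))                    # latent factors
    eps = rng.standard_normal((n, p)) * np.sqrt(psi)   # idiosyncratic noise
    x = mu + z @ Lambda.T + eps                        # Gaussian with factor covariance
    return x / np.linalg.norm(x, axis=1, keepdims=True)  # project onto the unit sphere


# Example with assumed sizes: 10-dimensional data driven by 2 latent factors.
p, q = 10, 2
rng = np.random.default_rng(1)
mu = rng.standard_normal(p)
Lambda = rng.standard_normal((p, q))
psi = rng.uniform(0.5, 1.5, size=p)
y = sample_projected_normal_factor(500, mu, Lambda, psi)
```

Every row of `y` has unit length, so only the direction of the latent Gaussian vector is observed. This is why fitting such a model needs a dedicated algorithm rather than ordinary factor analysis applied to the raw coordinates.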