We introduce sparse principal loading analysis, a new concept that reduces the dimensionality of cross-sectional data and identifies the underlying covariance structure. Sparse principal loading analysis reduces dimensionality by selecting a subset of the existing variables and discarding variables that have only a small distorting effect on the covariance matrix. We show how to detect those variables and provide methods to assess the magnitude of their distortion. The method serves a second purpose as well: it can identify an underlying block-diagonal covariance structure. In this context, we provide criteria to evaluate whether the detected block structure fits the sample. The method is based on sparse loadings rather than eigenvectors, which can result in a large loss of information if the chosen loadings are too sparse. However, we show that this is not a concern in our approach because sparseness is controlled by the aforementioned evaluation criteria. Further, we show the advantages of sparse principal loading analysis over principal loading analysis and illustrate the performance of the method on simulated and real datasets. Supplementary materials for this article are available online.