We propose sparse principal loading analysis, a new method that reduces the dimensionality of cross-sectional data and identifies the underlying covariance structure. For dimensionality reduction, the method selects a subset of the existing variables and discards those that have only a small distorting effect on the covariance matrix. We show how to detect these variables and provide methods to assess the magnitude of their distortion. The method also serves a second purpose: it can identify the underlying block-diagonal covariance structure using sparse loadings. This approach is new in this context, and we provide criteria for evaluating whether the detected block structure fits the sample. Because the method decomposes the covariance matrix with sparse loadings rather than eigenvectors, a large loss of information can result if the chosen loadings are too sparse. However, we show that this is not a concern in our approach, because sparseness is controlled by the aforementioned evaluation criteria. Further, we show the advantages of sparse principal loading analysis for both variable selection and covariance structure detection, and illustrate its performance with simulations and on real datasets. Supplementary material for this article is available online.
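To illustrate the idea that the zero pattern of the loadings reveals a block-diagonal covariance structure, the following is a minimal sketch and not the authors' procedure. It assumes a known population covariance matrix and uses hard-thresholded eigenvectors as a stand-in for sparse loadings. The function names `loading_blocks` and `variance_share`, the `threshold` parameter, and the trace-based variance share are hypothetical choices made for this example; they are not the evaluation criteria of the paper.

```python
import numpy as np


def loading_blocks(cov, threshold=1e-8):
    """Group variables into blocks using the zero pattern of the loadings.

    Assumption: eigenvectors with entries below `threshold` set to zero
    stand in for sparse loadings. This is an illustrative simplification.
    """
    _, vecs = np.linalg.eigh(cov)
    # support[i, j] is True if variable i loads on component j
    support = (np.abs(vecs) > threshold).astype(int)
    # Two variables are linked if at least one component loads on both
    linked = (support @ support.T) > 0

    p = cov.shape[0]
    labels = -np.ones(p, dtype=int)
    current = 0
    for start in range(p):
        if labels[start] >= 0:
            continue
        # Depth-first search over linked variables to collect one block
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(linked[i]):
                if labels[j] < 0:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def variance_share(cov, labels):
    """Share of the total variance (trace) carried by each block.

    Hypothetical summary used here only to show how much variance
    would be lost if a block of variables were discarded.
    """
    diag = np.diag(cov)
    return np.array([diag[labels == k].sum() for k in np.unique(labels)]) / diag.sum()


# Toy covariance matrix with two independent blocks of two variables each
cov = np.array([
    [2.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 1.0],
    [0.0, 0.0, 1.0, 4.0],
])
labels = loading_blocks(cov)
shares = variance_share(cov, labels)
print(labels, shares)
```

With a sample covariance matrix the off-block entries of the loadings are small but not exactly zero, so the threshold would have to be chosen much more carefully than in this toy case; controlling that choice is what the evaluation criteria mentioned in the abstract are meant to do.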