The assumption of independent subvectors arises in many aspects of multivariate analysis. In most real-world applications, however, we lack prior knowledge about the number of subvectors and the specific variables within each subvector. Yet, testing all these combinations is not feasible. For example, for a data matrix containing 15 variables, there are already 1 382 958 545 possible combinations. Given that zero correlation is a necessary condition for independence, independent subvectors exhibit a block diagonal covariance matrix. This paper focuses on the detection of such block diagonal covariance structures in high-dimensional data and therefore also identifies uncorrelated subvectors. Our nonparametric approach exploits the fact that the structure of the covariance matrix is mirrored by the structure of its eigenvectors. However, the true block diagonal structure is masked by noise in the sample case. To address this problem, we propose to use sparse approximations of the sample eigenvectors to reveal the sparse structure of the population eigenvectors. Notably, the right singular vectors of a data matrix with an overall mean of zero are identical to the sample eigenvectors of its covariance matrix. Using sparse approximations of these singular vectors instead of the eigenvectors makes the estimation of the covariance matrix obsolete. We demonstrate the performance of our method through simulations and provide real data examples. Supplementary materials for this article are available online.
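Two of the claims above can be checked numerically. The following is a minimal sketch and not the authors' implementation. It assumes that the "possible combinations" are the set partitions of the variables, which are counted by the Bell number, and that the data matrix is centered column by column, so that its cross-product equals the sample covariance matrix up to the factor 1/(n - 1). The function names `bell_number` and `singular_and_eigen_vectors` are hypothetical and introduced only for illustration.

```python
import numpy as np


def bell_number(n: int) -> int:
    """Count the set partitions of n labelled variables (Bell triangle)."""
    row = [1]
    for _ in range(n - 1):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[-1]


def singular_and_eigen_vectors(X: np.ndarray):
    """Return right singular vectors of the column-centered data matrix
    and eigenvectors of its sample covariance, both ordered by
    decreasing variance, together with the matching variances."""
    Xc = X - X.mean(axis=0)          # assumption: center each column
    n = Xc.shape[0]

    # Route 1: SVD of the data matrix, no covariance matrix required.
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    right_singular = vt.T
    var_from_svd = s**2 / (n - 1)

    # Route 2: eigendecomposition of the sample covariance matrix.
    S = Xc.T @ Xc / (n - 1)
    eigvals, eigvecs = np.linalg.eigh(S)   # ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

    return right_singular, eigvecs, var_from_svd, eigvals


if __name__ == "__main__":
    print(bell_number(15))

    rng = np.random.default_rng(0)
    # Unequal column scales keep the eigenvalues well separated.
    X = rng.standard_normal((200, 6)) * np.arange(1, 7)
    V, E, v_svd, v_eig = singular_and_eigen_vectors(X)
    # Eigenvectors are only defined up to sign, so compare absolute values.
    print(np.allclose(np.abs(V), np.abs(E), atol=1e-6))
    print(np.allclose(v_svd, v_eig))
```

The comparison uses absolute values because singular vectors and eigenvectors are determined only up to sign. The sketch covers the equivalence of the two routes only; the sparse approximation of the singular vectors, which is the core of the proposed method, is not reproduced here.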