The sample correlation matrix, a fundamental concept in multivariate statistics, is often used to infer the correlation/dependence structure among random variables when the population mean and covariance are unknown. A natural block extension of it, the {\it sample block correlation matrix}, is proposed to play the same role when the random variables are generalized to random sub-vectors. In this paper, we establish a spectral theory of sample block correlation matrices and apply it to the group independence test and related problems in the high-dimensional setting. More specifically, we consider a random vector of dimension $p$ consisting of $k$ sub-vectors of dimensions $p_t$, where each $p_t$ can range from $1$ to order $p$. Our primary goal is to investigate the dependence among the $k$ sub-vectors. For this purpose, we construct a random matrix model, called the sample block correlation matrix, based on $n$ samples. The spectral statistics of the sample block correlation matrix include the classical Wilks' statistic and Schott's statistic as special cases. It turns out that these spectral statistics do not depend on the unknown population mean and covariance. Further, under the null hypothesis that the sub-vectors are independent, the limiting behavior of the spectral statistics can be described with the aid of free probability theory. Specifically, under three different settings of possibly $n$-dependent $k$ and $p_t$, we show that the empirical spectral distribution of the sample block correlation matrix converges to the free Poisson binomial distribution, the free Poisson distribution (Marchenko-Pastur law), and the free Gaussian distribution (semicircle law), respectively. We further derive CLTs for the linear spectral statistics of the sample block correlation matrix under a general setting.
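As a rough illustration of the object discussed above, the following NumPy sketch builds a block correlation matrix from data. The abstract does not state the definition, so the construction used here is an assumption: $B = D S D$, where $S$ is the centered sample covariance and $D$ is block diagonal with blocks $S_{tt}^{-1/2}$. The function name `sample_block_correlation` and its arguments are hypothetical, and each diagonal block $S_{tt}$ is assumed to be invertible (roughly, $n > p_t + 1$).

```python
import numpy as np


def sample_block_correlation(X, block_sizes):
    """Assumed construction: B = D S D with D = blockdiag(S_tt^{-1/2}).

    X           : (n, p) array, one sample per row.
    block_sizes : list of sub-vector dimensions p_t, summing to p.
    """
    n, p = X.shape
    if sum(block_sizes) != p:
        raise ValueError("block sizes must sum to the number of columns")
    # Centered sample covariance, so the population mean drops out.
    S = np.atleast_2d(np.cov(X, rowvar=False))
    D = np.zeros((p, p))
    start = 0
    for pt in block_sizes:
        sl = slice(start, start + pt)
        # Symmetric inverse square root of the diagonal block S_tt.
        w, V = np.linalg.eigh(S[sl, sl])
        D[sl, sl] = (V / np.sqrt(w)) @ V.T
        start += pt
    B = D @ S @ D
    return (B + B.T) / 2  # symmetrize against round-off


# Illustrative data: n = 200 samples, k = 3 sub-vectors of sizes 1, 2, 3.
rng = np.random.default_rng(0)
blocks = [1, 2, 3]
X = rng.standard_normal((200, sum(blocks)))
B = sample_block_correlation(X, blocks)
eigs = np.linalg.eigvalsh(B)

# Two linear spectral statistics of B.
log_det = np.sum(np.log(eigs))        # equals log(det S / prod_t det S_tt)
trace_sq = np.sum((eigs - 1.0) ** 2)  # tr((B - I)^2), a Schott-type quantity

# Block-wise affine change of the data: X -> X A + mu, A block diagonal.
A = np.zeros((sum(blocks), sum(blocks)))
start = 0
for pt in blocks:
    A[start:start + pt, start:start + pt] = (
        rng.standard_normal((pt, pt)) + 3.0 * np.eye(pt)
    )
    start += pt
X_transformed = X @ A + rng.standard_normal(sum(blocks))
eigs_transformed = np.linalg.eigvalsh(
    sample_block_correlation(X_transformed, blocks)
)
```

Under this assumed construction the diagonal blocks of $B$ are identity matrices, and the sum of the log-eigenvalues equals $\log(\det S / \prod_t \det S_{tt})$, the likelihood-ratio form associated with Wilks' statistic. The last lines apply an invertible block-diagonal linear map and a shift to the data; the spectrum of $B$ is unchanged, which is one way to see why the spectral statistics do not involve the unknown mean or the within-block covariances. The normalization of `trace_sq` relative to Schott's statistic is not taken from the source.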