We consider the problem of estimating the high-dimensional covariance matrices of $K$ populations (classes) in the setting where the sample sizes are comparable to the data dimension. We propose estimating each class covariance matrix as a distinct linear combination of all class sample covariance matrices. This approach is shown to reduce the estimation error when the sample sizes are limited and the true class covariance matrices share a similar structure. We develop an effective method for estimating the coefficients of the linear combination that minimize the mean squared error, under the general assumption that the samples are drawn from (unspecified) elliptically symmetric distributions with finite fourth-order moments. To this end, we use the spatial sign covariance matrix, which we show (under rather general conditions) to be an asymptotically unbiased estimator of the normalized covariance matrix as the dimension grows to infinity. We also show how the proposed method can be used to choose the regularization parameters for multiple target matrices in a single-class covariance matrix estimation problem. We assess the proposed method through numerical simulation studies and an application to global minimum variance portfolio optimization using real stock data.
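To make the two main ingredients concrete, the following is a minimal NumPy sketch of (i) the spatial sign covariance matrix and (ii) an estimate of each class covariance matrix formed as a linear combination of all class sample covariance matrices. The function names, the use of the sample mean as the centering point, and the weight matrix `A` are illustrative assumptions; in particular, `A` is chosen by hand here and is not the MSE-minimizing set of coefficients that the proposed method estimates.

```python
import numpy as np


def sample_covariance(X):
    """Unbiased sample covariance of an (n, p) data matrix."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / (X.shape[0] - 1)


def spatial_sign_covariance(X, center=None):
    """Average outer product of the centered samples scaled to unit norm."""
    if center is None:
        # Assumption: center at the sample mean; other centers are possible.
        center = X.mean(axis=0)
    Xc = X - center
    norms = np.linalg.norm(Xc, axis=1, keepdims=True)
    keep = norms[:, 0] > 0  # drop samples that coincide with the center
    U = Xc[keep] / norms[keep]
    return U.T @ U / U.shape[0]


def linear_pooling(scms, A):
    """Return a (K, p, p) array whose k-th slice is sum_j A[k, j] * scms[j]."""
    S = np.stack(scms)  # shape (K, p, p)
    return np.einsum("kj,jab->kab", np.asarray(A, dtype=float), S)


# Illustrative data: K = 2 classes, dimension comparable to the sample sizes.
rng = np.random.default_rng(0)
p = 20
X1 = rng.standard_normal((25, p))
X2 = 1.5 * rng.standard_normal((30, p))

scms = [sample_covariance(X1), sample_covariance(X2)]

# Hypothetical hand-picked weights; row k holds the coefficients for class k.
A = np.array([[0.8, 0.2],
              [0.3, 0.7]])
pooled = linear_pooling(scms, A)

sscm = spatial_sign_covariance(X1)
```

By construction the spatial sign covariance matrix has unit trace, which is why it is naturally compared against a trace-normalized covariance matrix rather than the covariance matrix itself. Setting `A` to the identity matrix recovers the individual sample covariance matrices, so the linear combination can be read as a generalization of per-class estimation.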