We consider the problem of estimating high-dimensional covariance matrices of $K$ populations or classes in the setting where the sample sizes are comparable to the data dimension. We propose estimating each class covariance matrix as a distinct linear combination of all class sample covariance matrices. This approach is shown to reduce the estimation error when the sample sizes are limited and the true class covariance matrices share a somewhat similar structure. We develop an effective method for estimating the coefficients of the linear combination that minimize the mean squared error, under the general assumption that the samples are drawn from (unspecified) elliptically symmetric distributions possessing finite fourth-order moments. To this end, we utilize the spatial sign covariance matrix, which we show (under rather general conditions) to be an asymptotically unbiased estimator of the normalized covariance matrix as the dimension grows to infinity. We also show how the proposed method can be used to choose the regularization parameters for multiple target matrices in a single-class covariance matrix estimation problem. We assess the proposed method via numerical simulation studies, including an application to global minimum variance portfolio optimization using real stock data.
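The two building blocks named in the abstract, a linear combination of class sample covariance matrices and the spatial sign covariance matrix, can be illustrated with a minimal Python sketch. The function names, the simulated Gaussian data, and in particular the weight matrix `A` are assumptions made for illustration only: the weights here are fixed placeholder values, not the MSE-minimizing coefficients that the proposed method estimates, and the sample mean is assumed as the centering point for the spatial signs.

```python
import numpy as np

def sample_covariance(X):
    """Unbiased sample covariance of the rows of X (n samples, p variables)."""
    return np.cov(X, rowvar=False)

def spatial_sign_covariance(X, center=None):
    """Average outer product of the centered samples scaled to unit norm."""
    if center is None:
        center = X.mean(axis=0)  # assumption: sample mean used as the center
    Z = X - center
    norms = np.linalg.norm(Z, axis=1)
    keep = norms > 0  # drop samples that coincide with the center
    Z = Z[keep] / norms[keep][:, None]
    return Z.T @ Z / Z.shape[0]

def pooled_estimates(scms, A):
    """Return K estimates; estimate k is sum_j A[k, j] * scms[j]."""
    S = np.stack(scms)  # shape (K, p, p)
    return np.einsum("kj,jab->kab", A, S)

# Hypothetical setup: K = 3 classes, dimension comparable to the sample sizes.
rng = np.random.default_rng(0)
p = 20
sample_sizes = [15, 25, 30]
X_list = [rng.standard_normal((n, p)) for n in sample_sizes]

scms = [sample_covariance(X) for X in X_list]
sscm = spatial_sign_covariance(X_list[0])

# Placeholder weights (NOT the estimated optimal coefficients):
# each class keeps 0.7 of its own matrix and borrows 0.1 from each other class.
K = len(scms)
A = 0.7 * np.eye(K) + 0.1 * (np.ones((K, K)) - np.eye(K))
pooled = pooled_estimates(scms, A)
```

Each row of `A` defines a distinct linear combination for one class, so setting `A` to the identity matrix recovers the individual sample covariance matrices. The spatial sign covariance matrix has unit trace by construction, which is why it relates to the normalized rather than the unnormalized covariance matrix.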