We address the problem of estimating large-dimensional covariance matrices in the presence of correlations between samples. To this end, we generalize the Marcenko-Pastur equation and the Ledoit-Peche shrinkage estimator using methods of random matrix theory and free probability. We develop an efficient algorithm that implements the corresponding analytic formulas, based on the Ledoit-Wolf kernel estimation technique. We also provide an associated open-source Python library, called "shrinkage", with a user-friendly API to assist in the practical estimation of large covariance matrices. We present an example of its usage on synthetic data generated with exponentially decaying auto-correlations.
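To make the setting concrete, the following is a minimal sketch of the kind of synthetic data described above. It uses plain NumPy only and does not call the "shrinkage" library, whose API is not shown here. It rests on several assumptions that are not taken from the text: the exponentially decaying auto-correlation is produced by an AR(1) process with decay time `tau`, the true covariance matrix is the identity, and the names `ar1_samples` and `sample_covariance` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def ar1_samples(n_vars, n_samples, tau, rng):
    """Return an (n_vars, n_samples) array of unit-variance AR(1) series.

    Rows are independent of each other (true covariance = identity, an
    assumption of this sketch); within each row, samples at times t and s
    have correlation exp(-|t - s| / tau).
    """
    b = np.exp(-1.0 / tau)              # lag-1 auto-correlation
    scale = np.sqrt(1.0 - b ** 2)       # keeps the variance equal to 1
    eps = rng.standard_normal((n_vars, n_samples))
    x = np.empty_like(eps)
    x[:, 0] = eps[:, 0]                 # start in the stationary state
    for t in range(1, n_samples):
        x[:, t] = b * x[:, t - 1] + scale * eps[:, t]
    return x


def sample_covariance(x):
    """Sample covariance of zero-mean data with variables in rows."""
    return x @ x.T / x.shape[1]


rng = np.random.default_rng(0)
n_vars, n_samples, tau = 100, 400, 3.0

x_corr = ar1_samples(n_vars, n_samples, tau, rng)   # auto-correlated samples
x_iid = rng.standard_normal((n_vars, n_samples))    # independent samples

eig_corr = np.linalg.eigvalsh(sample_covariance(x_corr))
eig_iid = np.linalg.eigvalsh(sample_covariance(x_iid))

# Compare how far the sample eigenvalues spread around the true value 1.
print("eigenvalue std, auto-correlated samples:", eig_corr.std())
print("eigenvalue std, independent samples:    ", eig_iid.std())
```

In this sketch both data sets share the same true covariance, so any difference in the spread of the sample eigenvalues comes from the correlations between samples alone. Auto-correlated samples carry less independent information, which is the effect a shrinkage estimator for this setting has to take into account.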