Obtaining an accurate estimate of the underlying covariance matrix from data with a finite number of samples is challenging due to sampling noise. In recent years, sophisticated covariance-cleaning techniques based on random matrix theory have been proposed to address this issue. Most of these methods aim to achieve an optimal covariance matrix estimator by minimizing the Frobenius norm distance as a measure of the discrepancy between the true covariance matrix and the estimator. However, this practice offers limited interpretability in terms of information theory. To better understand this relationship, we focus on the Kullback-Leibler divergence to quantify the information lost by the estimator. Our analysis centers on rotationally invariant estimators, which are the state of the art in random matrix theory, and we derive an analytical expression for their Kullback-Leibler divergence. Due to the intricate nature of the calculations, we pair genetic programming regressors with human intuition. Using this approach, we formulate a conjecture, validated through extensive simulations, showing that the Frobenius distance corresponds to a first-order expansion term of the Kullback-Leibler divergence, thus establishing a clearer link between the two measures.
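To illustrate the type of link the abstract refers to, one can write the Kullback-Leibler divergence between two zero-mean Gaussians with covariances \(\Sigma\) (the true matrix) and \(\Xi\) (the estimator). The symbols \(\Sigma\), \(\Xi\), \(\Delta\), the dimension \(N\), and the zero-mean Gaussian setting are assumptions of this sketch rather than notation taken from the paper, and the expansion below is a standard identity, not the paper's conjecture for rotationally invariant estimators:

\[
D_{\mathrm{KL}}\!\big(\mathcal{N}(0,\Sigma)\,\|\,\mathcal{N}(0,\Xi)\big)
  = \tfrac{1}{2}\Big[\operatorname{tr}\!\big(\Xi^{-1}\Sigma\big) - N + \ln\tfrac{\det\Xi}{\det\Sigma}\Big],
\qquad
D_{\mathrm{KL}} \approx \tfrac{1}{4}\,\big\|\Sigma^{-1/2}\,\Delta\,\Sigma^{-1/2}\big\|_F^2
  \quad\text{for } \Xi = \Sigma + \Delta,\ \Delta \text{ small}.
\]

In this simplified setting, the leading-order term of the divergence is a \(\Sigma\)-weighted squared Frobenius norm, which is the kind of correspondence between the two measures that the conjecture makes precise.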