Tyler's and Maronna's M-estimators, as well as their regularized variants, are popular robust methods for estimating the scatter or covariance matrix of a multivariate distribution. In this work, we study the non-asymptotic behavior of these estimators for data sampled from a distribution of one of the following types: 1) distributions with independent sub-Gaussian entries, up to a linear transformation; 2) log-concave distributions; 3) distributions satisfying a convex concentration property. Our main contribution is the derivation of tight non-asymptotic concentration bounds for these M-estimators around a suitably scaled version of the sample covariance matrix of the data. Prior to our work, non-asymptotic bounds had been derived only for elliptical and Gaussian distributions. Our proof uses a variety of tools from non-asymptotic random matrix theory and high-dimensional geometry. Finally, we illustrate the utility of our results on two examples of practical interest: sparse covariance and sparse precision matrix estimation.
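For readers unfamiliar with the estimators discussed above, the following is a minimal sketch of the standard fixed-point iteration commonly used to compute Tyler's M-estimator. It is not taken from this work: the function name `tyler_m_estimator`, the iteration cap, the tolerance, and the normalization of the trace to the dimension `p` are all illustrative assumptions, and the data are assumed to be already centered.

```python
import numpy as np


def tyler_m_estimator(X, n_iter=200, tol=1e-10):
    """Sketch of Tyler's M-estimator via fixed-point iteration.

    Assumptions (not from the source): rows of X are centered samples,
    n > p, and the estimate is normalized so that trace(sigma) == p.
    """
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        # d[i] = x_i^T sigma^{-1} x_i for every sample x_i
        solved = np.linalg.solve(sigma, X.T)       # shape (p, n)
        d = np.einsum("ij,ji->i", X, solved)       # shape (n,)
        # Weighted sum of outer products: (p / n) * sum_i x_i x_i^T / d[i]
        new_sigma = (p / n) * (X.T / d) @ X
        # Fix the scale, since Tyler's estimator is defined up to a constant
        new_sigma *= p / np.trace(new_sigma)
        converged = np.linalg.norm(new_sigma - sigma, ord="fro") < tol
        sigma = new_sigma
        if converged:
            break
    return sigma
```

Each sample is reweighted by the inverse of its quadratic form, so rescaling any individual sample by a positive constant leaves the estimate unchanged. This scale invariance is the source of the estimator's robustness to heavy tails.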