This paper tackles the problem of robust covariance matrix estimation when the data are incomplete. Classical statistical estimation methods are usually built on the Gaussian assumption, whereas existing robust estimation methods assume unstructured signal models. The former can be inaccurate on real-world data sets in which heterogeneity produces heavy-tailed distributions, while the latter do not exploit the low-rank structure that the signal usually exhibits. Combining the strengths of both approaches, a covariance matrix estimation procedure is designed on a robust (compound Gaussian) low-rank model by leveraging the observed-data likelihood function within an expectation-maximization algorithm. The procedure is also designed to handle general patterns of missing values. It is first validated on simulated data sets. Its usefulness for classification and clustering applications is then assessed on two real data sets with missing values, which include multispectral and hyperspectral time series.
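To make the expectation-maximization idea concrete, the following is a minimal sketch of EM covariance estimation under a general missing-value pattern. It is deliberately simplified and is not the procedure proposed in the paper: it assumes a zero-mean, plain Gaussian model with an unstructured covariance, whereas the paper works with a compound Gaussian low-rank model. The function name `em_covariance`, the `n_iter` parameter, the diagonal initialization, and the NaN encoding of missing entries are all assumptions made for this illustration.

```python
import numpy as np


def em_covariance(X, n_iter=50):
    """EM estimate of a zero-mean Gaussian covariance from data with NaNs.

    Illustrative only: plain Gaussian model, not the compound Gaussian
    low-rank model of the paper. Each column of X is assumed to contain
    at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    obs = ~np.isnan(X)  # True where an entry is observed

    # Initialization (an assumption): diagonal of observed second moments.
    sigma = np.diag(np.nanmean(X ** 2, axis=0))

    for _ in range(n_iter):
        S = np.zeros((p, p))  # accumulates E[x x^T | observed part]
        for x, o in zip(X, obs):
            m = ~o
            x_hat = np.where(o, x, 0.0)  # missing entries set to 0 for now
            if m.any() and o.any():
                # K = Sigma_mo Sigma_oo^{-1}, using the symmetry of sigma.
                K = np.linalg.solve(sigma[np.ix_(o, o)],
                                    sigma[np.ix_(o, m)]).T
                # E-step: conditional mean of the missing entries.
                x_hat[m] = K @ x[o]
                # E-step: conditional covariance of the missing entries.
                S[np.ix_(m, m)] += sigma[np.ix_(m, m)] - K @ sigma[np.ix_(o, m)]
            elif m.any():
                # Fully missing row: conditional law is the prior N(0, sigma).
                S += sigma
            S += np.outer(x_hat, x_hat)
        # M-step: average of the expected outer products.
        sigma = S / n
    return sigma
```

Because each row is processed with its own observed and missing index sets, the sketch makes no assumption about the missingness pattern (monotone, block, or arbitrary). With complete data it reduces to the sample second-moment matrix `X.T @ X / n`.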