强力高层次期望通过特里曼硬推进器实现最大化 (Robust High Dimensional Expectation Maximization Algorithm via Trimmed Hard Thresholding)

In this paper, we study the problem of estimating latent variable models with arbitrarily corrupted samples in high dimensional space ({\em i.e.,} $d\gg n$) where the underlying parameter is assumed to be sparse. Specifically, we propose a method called Trimmed (Gradient) Expectation Maximization which adds a trimming gradients step and a hard thresholding step to the Expectation step (E-step) and the Maximization step (M-step), respectively. We show that under some mild assumptions and with an appropriate initialization, the algorithm is corruption-proofing and converges to the (near) optimal statistical rate geometrically when the fraction of the corrupted samples $\epsilon$ is bounded by $ \tilde{O}(\frac{1}{\sqrt{n}})$. Moreover, we apply our general framework to three canonical models: mixture of Gaussians, mixture of regressions and linear regression with missing covariates. Our theory is supported by thorough numerical results.

翻译：在本文中,我们研究了在高维空间( 也就是说, $d\gg n$) 任意腐蚀的样本中估计潜在变量模型的问题。具体地说, 我们提出了一种叫Trimmed( 大) 期望最大化的方法, 这种方法分别对期望步骤( E- 步) 和最大化步骤( M- 步) 添加了三重梯度梯度步骤和硬阈值步骤。我们表明, 在一些温和的假设和适当的初始化下, 算法是防止腐败的, 并在( 近于) 最佳统计率的几何上与( ) 最佳统计率趋同。当腐蚀样品中, $\ epsilon 部分被 $\ tilde{ O} (\ frac{ 1unsqrt{ n_ } (\\\\\\\\\ \ \ ) 。此外, 我们把我们的总框架应用于三种卡通模型: 高斯人混合物、回归和线性回归与缺失的计算结果的支持。我们的理论得到彻底的数字支持。