In this work, we solve the problem of robustly learning a high-dimensional Gaussian mixture model with $k$ components from $\epsilon$-corrupted samples up to accuracy $\widetilde{O}(\epsilon)$ in total variation distance, for any constant $k$ and under mild assumptions on the mixture. This robustness guarantee is optimal up to polylogarithmic factors. The main challenge is that most earlier works rely on learning the individual components of the mixture, but this is impossible in our setting, at least for the types of strong robustness guarantees we are aiming for. Instead, we introduce a new framework, which we call {\em strong observability}, that gives us a route to circumvent this obstacle.