Gaussian Mixture Models are a powerful tool in data science and statistics, used mainly for clustering and density approximation. In practice, the model parameters are often estimated with the Expectation Maximization (EM) algorithm, which is attractive for its simplicity and low per-iteration cost. However, EM converges slowly when there is a large share of hidden information or when clusters overlap. Recent advances in manifold optimization for Gaussian Mixture Models have gained increasing interest. We introduce an explicit formula for the Riemannian Hessian for Gaussian Mixture Models. In addition, we propose a new Riemannian Newton Trust-Region method which outperforms current approaches in both runtime and number of iterations. We apply our method to clustering problems and density approximation tasks. Compared to existing methods, it is particularly effective on data with a large share of hidden information.
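For orientation, the EM baseline mentioned above can be sketched in a few lines. This is a minimal illustration for a univariate mixture, not the Riemannian Newton Trust-Region method proposed here. The function names (`normal_pdf`, `em_step`, `fit`), the synthetic data, and the starting values are all assumptions chosen for the example, and the sketch has no safeguards against degenerate components.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_step(data, weights, means, variances):
    """One EM iteration for a univariate Gaussian mixture."""
    # E-step: posterior probability (responsibility) of each component per point.
    resp = []
    for x in data:
        joint = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])

    # M-step: closed-form updates of weights, means and variances.
    n = len(data)
    new_w, new_m, new_v = [], [], []
    for j in range(len(weights)):
        nj = sum(r[j] for r in resp)
        mean_j = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var_j = sum(r[j] * (x - mean_j) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / n)
        new_m.append(mean_j)
        new_v.append(var_j)
    return new_w, new_m, new_v


def fit(data, weights, means, variances, iterations=50):
    """Run a fixed number of EM iterations and record the log-likelihood."""
    history = [log_likelihood(data, weights, means, variances)]
    for _ in range(iterations):
        weights, means, variances = em_step(data, weights, means, variances)
        history.append(log_likelihood(data, weights, means, variances))
    return weights, means, variances, history


# Synthetic two-cluster data (illustrative only).
rng = random.Random(0)
data = ([rng.gauss(-2.0, 1.0) for _ in range(200)]
        + [rng.gauss(3.0, 1.0) for _ in range(200)])
weights, means, variances, history = fit(
    data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
)
```

Each iteration costs one pass over the data per component, which is the low per-iteration cost referred to above. The recorded `history` should be non-decreasing; how many iterations are needed before it flattens depends on how strongly the clusters overlap.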