Local Minima Structures in Gaussian Mixture Models

We investigate the landscape of the negative log-likelihood function of Gaussian Mixture Models (GMMs) with a general number of components in the population limit. As the objective function is non-convex, there can be multiple local minima that are not globally optimal, even for well-separated mixture models. Our study reveals that all local minima share a common structure that partially identifies the cluster centers (i.e., means of the Gaussian components) of the true location mixture. Specifically, each local minimum can be represented as a non-overlapping combination of two types of sub-configurations: fitting a single mean estimate to multiple Gaussian components, or fitting multiple estimates to a single true component. These results apply to settings where the true mixture components satisfy a certain separation condition, and they remain valid even when the number of components is over- or under-specified. We also present a more fine-grained analysis for the setting of one-dimensional GMMs with three components, which provides sharper approximation error bounds with improved dependence on the separation.
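The two sub-configurations described above can be seen in a small numerical sketch. Everything specific here is an illustrative assumption rather than the paper's setup: equal mixing weights, unit variances, true means at -5, 5, and 50, a large sample standing in for the population limit, and plain EM on the means as the optimizer. Started near a configuration with one estimate between the first two components and two estimates on the third, EM stays there, even though that point has a much higher negative log-likelihood than the fit that recovers all three means.

```python
import numpy as np

def neg_log_likelihood(x, mu):
    """Average negative log-likelihood of an equal-weight, unit-variance 1-D GMM."""
    log_comp = -0.5 * (x[:, None] - mu[None, :]) ** 2 - 0.5 * np.log(2 * np.pi)
    m = log_comp.max(axis=1, keepdims=True)  # stabilise the log-sum-exp
    log_mix = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1)) - np.log(len(mu))
    return -log_mix.mean()

def em_means(x, mu_init, n_iter=300):
    """EM updates for the means only (weights and variances are held fixed)."""
    mu = np.asarray(mu_init, dtype=float).copy()
    for _ in range(n_iter):
        # E-step: responsibilities of each mean estimate for each sample.
        log_w = -0.5 * (x[:, None] - mu[None, :]) ** 2
        log_w -= log_w.max(axis=1, keepdims=True)
        w = np.exp(log_w)
        w /= w.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted sample means.
        mu = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
    return mu

# Hypothetical well-separated truth; a large sample approximates the population.
rng = np.random.default_rng(0)
true_mu = np.array([-5.0, 5.0, 50.0])
labels = rng.integers(0, 3, size=30_000)
x = true_mu[labels] + rng.standard_normal(labels.size)

# One estimate between two true components, two estimates on the third.
spurious = em_means(x, [0.0, 49.0, 51.0])
# One estimate per true component.
good = em_means(x, [-4.0, 4.0, 49.0])

print("spurious fit:", spurious, "NLL:", neg_log_likelihood(x, spurious))
print("good fit:    ", good, "NLL:", neg_log_likelihood(x, good))
```

In this sketch the first run combines both sub-configuration types at once: a single estimate fitted to two true components, and two estimates fitted to one true component.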
