In data processing and machine learning, an important challenge is to recover and exploit models that accurately represent the data. We consider the problem of recovering Gaussian mixture models from datasets. We investigate symmetric tensor decomposition methods for tackling this problem, where the tensor is built from empirical moments of the data distribution. We consider identifiable tensors, which have a unique decomposition, and show that moment tensors built from spherical Gaussian mixtures have this property. We prove that symmetric tensors with interpolation degree strictly less than half their order are identifiable, and we present an algorithm, based on simple linear algebra operations, to compute their decomposition. Illustrative experiments show the impact of the tensor decomposition method for recovering Gaussian mixtures, in comparison with other state-of-the-art approaches.
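To make the phrase "tensor built from empirical moments" concrete, the following minimal sketch forms a raw third-order empirical moment tensor from samples of a spherical Gaussian mixture. It is an illustration under stated assumptions, not the construction or the decomposition algorithm of this work: the function name `third_moment_tensor`, the mixture parameters, and the choice of order 3 are all hypothetical, and the moment tensor actually used for spherical mixtures may include covariance-dependent corrections that are not shown here.

```python
import numpy as np


def third_moment_tensor(samples):
    """Raw empirical third-order moment: the mean of x ⊗ x ⊗ x over the rows.

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    n = samples.shape[0]
    # Sum the outer products x_i x_j x_k over all samples, then average.
    return np.einsum("ni,nj,nk->ijk", samples, samples, samples) / n


# Illustrative spherical Gaussian mixture in dimension 2 (assumed parameters).
rng = np.random.default_rng(0)
means = np.array([[2.0, 0.0], [-1.0, 1.5]])  # component centers
weights = np.array([0.4, 0.6])               # mixing weights, sum to 1
sigma = 0.5                                  # shared isotropic standard deviation

n_samples = 5000
labels = rng.choice(len(weights), size=n_samples, p=weights)
X = means[labels] + sigma * rng.standard_normal((n_samples, 2))

M3 = third_moment_tensor(X)  # symmetric tensor of shape (2, 2, 2)
```

By construction the resulting tensor is invariant under any permutation of its indices, which is the symmetry that symmetric tensor decomposition methods rely on.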