We present an approach for efficiently training Gaussian Mixture Models (GMMs) by Stochastic Gradient Descent (SGD) on non-stationary, high-dimensional streaming data. Our training scheme does not require data-driven parameter initialization (e.g., by k-means) and can thus start from a purely random initialization. Furthermore, the approach allows mini-batch sizes as low as 1, as is typical in streaming-data settings. Major problems in such settings are undesirable local optima during early training phases and numerical instabilities due to high data dimensionality. We introduce an adaptive annealing procedure to address the first problem, whereas numerical instabilities are eliminated by an exponential-free approximation to the standard GMM log-likelihood. Experiments on a variety of visual and non-visual benchmarks show that our SGD approach can be trained entirely without, for instance, k-means based centroid initialization. It also compares favorably to an online variant of Expectation-Maximization (EM), stochastic EM (sEM), which it outperforms by a large margin on very high-dimensional data.
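To make the exponential-free idea concrete, the sketch below (our own illustration, not the authors' code; the function name, the diagonal-covariance parameterization, and the choice of the max-component bound are assumptions) computes a lower bound on the GMM log-likelihood, max_k [log π_k + log N(x; μ_k, σ_k²)], entirely in log-space, so no exp() of a Gaussian exponent of order −d is ever evaluated and underflow at high dimensionality d cannot occur:

```python
import numpy as np

def max_component_log_likelihood(x, log_pi, mu, log_var):
    """Max-component lower bound on the GMM log-likelihood (hypothetical sketch).

    x:       (d,)   sample
    log_pi:  (K,)   log mixture weights
    mu:      (K, d) component centroids
    log_var: (K, d) log of diagonal variances
    """
    d = x.shape[0]
    # Per-component diagonal-covariance Gaussian log-densities,
    # computed directly in log-space.
    log_n = -0.5 * (
        d * np.log(2.0 * np.pi)
        + log_var.sum(axis=1)
        + ((x - mu) ** 2 / np.exp(log_var)).sum(axis=1)
    )
    # Exponential-free step: the max over components lower-bounds the
    # usual log-sum-exp, so the huge negative exponents that would
    # underflow exp() at high d are never exponentiated.
    return np.max(log_pi + log_n)

# Minimal usage example with random parameters (unit variances).
rng = np.random.default_rng(0)
K, d = 5, 1000
ll = max_component_log_likelihood(
    rng.normal(size=d),
    np.log(np.full(K, 1.0 / K)),
    rng.normal(size=(K, d)),
    np.zeros((K, d)),
)
print(ll)
```

In an SGD setting, the negative of this quantity could serve as a per-sample loss; it is differentiable almost everywhere, since the max simply selects the best-matching component.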