The Expectation Maximization (EM) algorithm is of key importance for inference in latent variable models, including mixtures of regressors and experts and models with missing observations. This paper introduces a novel EM algorithm, called \texttt{SPIDER-EM}, for inference from a training set of size $n$, $n \gg 1$. At the core of our algorithm is an estimator of the full conditional expectation in the {\sf E}-step, adapted from the stochastic path-integrated differential estimator ({\tt SPIDER}) technique. We derive finite-time complexity bounds for smooth non-convex likelihoods: we show that, for convergence to an $\epsilon$-approximate stationary point, the complexity scales as $K_{\operatorname{Opt}}(n,\epsilon) = {\cal O}(\epsilon^{-1})$ and $K_{\operatorname{CE}}(n,\epsilon) = n + \sqrt{n}\, {\cal O}(\epsilon^{-1})$, where $K_{\operatorname{Opt}}(n,\epsilon)$ and $K_{\operatorname{CE}}(n,\epsilon)$ denote, respectively, the number of {\sf M}-steps and the number of per-sample conditional-expectation evaluations. This improves over state-of-the-art algorithms. Numerical results support our findings.
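To make the {\sf E}-step estimator concrete, the following is a minimal sketch of a SPIDER-style update of the expected complete-data sufficient statistic; the notation (inner index $t$ within an epoch, mini-batch $\mathcal{B}_{t+1}$ of size $b$, per-sample statistic $\bar{s}_i$) is introduced here for illustration and is not defined in this abstract:
\[
\widehat{S}_{t+1} \;=\; \widehat{S}_{t} \;+\; \frac{1}{b} \sum_{i \in \mathcal{B}_{t+1}} \Big( \bar{s}_i(\theta_{t}) - \bar{s}_i(\theta_{t-1}) \Big),
\qquad
\bar{s}_i(\theta) \;:=\; \mathbb{E}\big[\, S_i(Z_i) \,\big\vert\, Y_i \,;\, \theta \,\big],
\]
where $\bar{s}_i(\theta)$ is the conditional expectation of the complete-data sufficient statistic for sample $i$, and $\widehat{S}$ is re-initialized by a full pass over the $n$ samples ($\widehat{S}_0 = n^{-1}\sum_{i=1}^{n} \bar{s}_i(\theta_0)$) at the start of each epoch. This path-integrated control-variate structure is what allows the per-sample conditional-expectation cost to scale as $K_{\operatorname{CE}}(n,\epsilon) = n + \sqrt{n}\,{\cal O}(\epsilon^{-1})$.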