We introduce G-PFSO, a new algorithm for expected log-likelihood maximization in situations where the objective function is multi-modal and/or has saddle points. The key idea underpinning G-PFSO is to define a sequence of probability distributions that (a) is shown to concentrate on the target parameter value and (b) can be efficiently estimated by means of a standard particle filter algorithm. These distributions depend on a learning rate: the faster the learning rate, the faster they concentrate on the desired parameter value, but the weaker the ability of G-PFSO to escape from a local optimum of the objective function. To reconcile the ability to escape from a local optimum with a fast convergence rate, the proposed estimator exploits the acceleration property of averaging, well known in the stochastic gradient literature. Based on challenging estimation problems, our numerical experiments suggest that the estimator introduced in this paper converges at the optimal rate, and illustrate the practical usefulness of G-PFSO for parameter inference in large datasets. Although the focus of this work is expected log-likelihood maximization, the proposed approach and its theory apply more generally to the optimization of a function defined through an expectation.
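To make the role of averaging concrete, the following minimal sketch shows iterate averaging as it is commonly used in the stochastic gradient literature. It is not G-PFSO and involves no particle filter. The toy model (a N(theta, 1) likelihood), the function name `sgd_with_averaging`, the step-size exponent `alpha`, and the simulated data are all assumptions made purely for illustration.

```python
import random


def sgd_with_averaging(data, theta0=0.0, alpha=0.6):
    """Stochastic gradient ascent on the log-likelihood of a N(theta, 1) model.

    Returns the last iterate and the running average of all iterates.
    """
    theta = theta0
    theta_bar = 0.0
    for t, y in enumerate(data, start=1):
        step = t ** (-alpha)                   # slowly decaying learning rate
        grad = y - theta                       # d/dtheta of log N(y; theta, 1)
        theta += step * grad                   # noisy ascent step
        theta_bar += (theta - theta_bar) / t   # running mean of the iterates
    return theta, theta_bar


# Simulated observations (hypothetical data, fixed seed for reproducibility).
rng = random.Random(0)
true_theta = 2.0
data = [rng.gauss(true_theta, 1.0) for _ in range(20000)]

last_iterate, averaged_iterate = sgd_with_averaging(data)
print(f"last iterate:     {last_iterate:.4f}")
print(f"averaged iterate: {averaged_iterate:.4f}")
```

The slowly decaying step size keeps the individual iterates mobile, while their running average smooths out the resulting noise. This is the same kind of trade-off between exploration and convergence speed that the abstract describes for the learning rate of G-PFSO.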