We develop a new continuous-time stochastic gradient descent method for optimizing over the stationary distribution of stochastic differential equation (SDE) models. The algorithm continuously updates the SDE model's parameters using an estimate for the gradient of the stationary distribution. The gradient estimate is simultaneously updated, asymptotically converging to the direction of steepest descent. We rigorously prove convergence of our online algorithm for linear SDE models and present numerical results for nonlinear examples. The proof requires analysis of the fluctuations of the parameter evolution around the direction of steepest descent. Bounds on the fluctuations are challenging to obtain due to the online nature of the algorithm (e.g., the stationary distribution will continuously change as the parameters change). We prove bounds for the solutions of a new class of Poisson partial differential equations, which are then used to analyze the parameter fluctuations in the algorithm.
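The idea can be illustrated on a toy linear SDE. The sketch below is not the paper's algorithm verbatim; it is a minimal online SGD loop, under assumed choices: an Ornstein–Uhlenbeck model `dX_t = (theta - X_t) dt + sigma dW_t` (stationary mean `theta`), the objective of matching the stationary mean to a target `y`, a forward-sensitivity process `xt ≈ dX_t/dtheta` serving as the continuously-updated gradient estimate, and a decreasing learning rate `alpha_t = 1/(1+t)`. Euler–Maruyama discretization is used for simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear SDE:  dX_t = (theta - X_t) dt + sigma dW_t.
# Its stationary mean is theta; we tune theta online so that the
# stationary mean matches a target y, i.e. minimize
# J(theta) = (E_pi[X] - y)^2.  All constants below are illustrative.
sigma, y = 0.5, 2.0
dt, T = 1e-3, 200.0
n_steps = int(T / dt)

theta = 0.0   # model parameter, updated online
x = 0.0       # SDE state
xt = 0.0      # sensitivity process, approximates dX_t/dtheta

for k in range(n_steps):
    t = k * dt
    dw = rng.normal(0.0, np.sqrt(dt))
    # Sensitivity SDE: d(xt) = (d_theta drift + d_x drift * xt) dt
    #                        = (1 - xt) dt   for this linear model.
    xt += (1.0 - xt) * dt
    # State update (Euler-Maruyama).
    x += (theta - x) * dt + sigma * dw
    # Online SGD step: instantaneous gradient estimate of J is
    # 2 (X_t - y) * xt, applied with a decreasing learning rate,
    # so the parameter and gradient estimate evolve simultaneously.
    alpha = 1.0 / (1.0 + t)
    theta -= alpha * 2.0 * (x - y) * xt * dt

print(theta)  # theta should settle near the target y = 2.0
```

Because the stationary distribution shifts as `theta` moves, the loop never waits for the SDE to equilibrate; the decreasing learning rate is what lets the time-averaged update align with the steepest-descent direction, which is exactly the fluctuation analysis the abstract refers to.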