This paper investigates the use of extended Kalman filtering to train recurrent neural networks with rather general convex loss functions and regularization terms on the network parameters, including $\ell_1$-regularization. We show that the learning method outperforms stochastic gradient descent in a nonlinear system identification benchmark and in training a linear system with binary outputs. We also explore the use of the algorithm in data-driven nonlinear model predictive control and its relation to disturbance models for offset-free closed-loop tracking.