For training recurrent neural network models of nonlinear dynamical systems from an input/output training dataset, based on rather arbitrary convex and twice-differentiable loss functions and regularization terms, we propose the use of sequential least squares to determine the optimal network parameters and hidden states. In addition, to handle non-smooth regularization terms such as L1, L0, and group-Lasso regularizers, as well as to impose possibly non-convex constraints such as integer and mixed-integer constraints, we combine sequential least squares with the alternating direction method of multipliers (ADMM). The performance of the resulting algorithm, which we call NAILS (Nonconvex ADMM Iterations and Least Squares), is tested on a nonlinear system identification benchmark.
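To illustrate the core mechanism the abstract describes — alternating a least-squares step with ADMM's proximal handling of a non-smooth term — here is a minimal sketch on the simplest such instance, L1-regularized least squares (lasso). This is not the NAILS algorithm itself (which targets recurrent network parameters and hidden states); the function names `admm_lasso` and `soft_threshold` are illustrative, and a fixed penalty `rho` with a fixed iteration count is assumed in place of any adaptive scheme.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Proximal operator of kappa*||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    """Solve min_x 0.5*||A x - b||^2 + lam*||x||_1 via ADMM.

    The x-update is a (regularized) least-squares solve; the z-update
    is the proximal step that handles the non-smooth L1 term, mirroring
    the least-squares / ADMM split described in the abstract.
    """
    m, n = A.shape
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable
    AtA = A.T @ A + rho * np.eye(n)  # factor once; reused each iteration
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(AtA, Atb + rho * (z - u))  # least-squares step
        z = soft_threshold(x + u, lam / rho)           # prox of L1 term
        u += x - z                                     # dual update
    return z

# Usage: recover a sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = admm_lasso(A, b, lam=0.1)
```

The same pattern extends to the non-convex cases mentioned above (L0, integer constraints) by replacing the soft-thresholding prox with hard-thresholding or rounding, at the cost of losing convexity guarantees.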