Identification of linear time-invariant (LTI) systems plays an important role in control and reinforcement learning. Both asymptotic and finite-time offline system identification are well studied in the literature. For online system identification, stochastic gradient descent with reverse experience replay (SGD-RER) was recently proposed: the data sequence is stored in several buffers, and the stochastic gradient descent (SGD) updates proceed backward within each buffer to break the time dependency between data points. Inspired by this work, we study distributed online system identification of LTI systems over a multi-agent network. We model the agents as identical LTI systems, and the network goal is to jointly estimate the system parameters by leveraging the communication between agents. We propose DSGD-RER, a distributed variant of the SGD-RER algorithm, and theoretically characterize the improvement of the estimation error with respect to the network size. Our numerical experiments certify the reduction of the estimation error as the network size grows.
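The buffer-and-reverse idea behind SGD-RER can be illustrated with a minimal single-agent sketch. The code below simulates a stable LTI system x_{t+1} = A* x_t + w_t, splits the trajectory into fixed-size buffers, and runs the SGD updates in reverse time order within each buffer; buffer size, step size, noise level, and horizon are illustrative choices, not the algorithm's tuned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3  # state dimension

# Ground-truth stable system matrix A* (spectral radius well below 1).
A_star = 0.5 * np.eye(d) + 0.1 * rng.standard_normal((d, d))

# Generate one trajectory x_{t+1} = A* x_t + w_t with i.i.d. Gaussian noise.
T = 20000
X = np.zeros((T + 1, d))
for t in range(T):
    X[t + 1] = A_star @ X[t] + 0.1 * rng.standard_normal(d)

B = 100     # buffer size (illustrative)
eta = 0.05  # SGD step size (illustrative)
A_hat = np.zeros((d, d))

# Split the trajectory into consecutive buffers; within each buffer,
# apply the SGD updates in reverse time order, which weakens the
# temporal correlation between the sample used and the current iterate.
for start in range(0, T, B):
    for t in range(min(start + B, T) - 1, start - 1, -1):
        pred_err = A_hat @ X[t] - X[t + 1]
        # Gradient step on the one-sample loss 0.5 * ||A x_t - x_{t+1}||^2.
        A_hat -= eta * np.outer(pred_err, X[t])

print(np.linalg.norm(A_hat - A_star, ord="fro"))
```

In the distributed setting studied here, each agent would run such updates on its own trajectory and additionally average its estimate with its neighbors' estimates, which is what drives the error reduction with network size.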