This letter studies online multi-step-ahead prediction for unknown linear stochastic systems. Using conditional distribution theory, we derive an optimal parameterization of the prediction policy as a linear function of future inputs, past inputs, and past outputs. Based on this characterization, we propose an online least-squares algorithm to learn the policy and analyze its regret relative to the optimal model-based predictor. We show that the online algorithm achieves logarithmic regret with respect to the optimal Kalman filter in the multi-step setting. Furthermore, using new proof techniques, we establish a regret bound that holds almost surely for all sufficiently large horizons $N$, rather than only with a fixed failure probability. Finally, our analysis reveals that, while the regret remains logarithmic in $N$, its constant factor grows polynomially in the prediction horizon $H$, with the polynomial order determined by the size of the largest Jordan block of the system matrix associated with eigenvalue 1.
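To make the high-level description concrete, the following is a minimal sketch (not the letter's exact algorithm) of an online least-squares learner for an $H$-step-ahead prediction policy that is linear in the future inputs, a window of past inputs, and a window of past outputs. The window length `L`, the ridge parameter `lam`, and the class/function names are illustrative assumptions, not taken from the letter.

```python
import numpy as np

def make_regressor(u_future, u_past, y_past):
    """Stack future inputs, past inputs, and past outputs into one feature vector.

    u_future: inputs u_{t+1..t+H}; u_past, y_past: inputs/outputs over a past window of length L.
    (The split into these three groups mirrors the linear policy parameterization described above.)
    """
    return np.concatenate([np.ravel(u_future), np.ravel(u_past), np.ravel(y_past)])

class OnlineMultiStepLS:
    """Ridge-regularized online least squares for the H-step prediction policy (illustrative sketch)."""

    def __init__(self, dim_phi, dim_y, lam=1.0):
        # Accumulated normal-equation statistics; Theta maps a regressor phi to a prediction of y_{t+H}.
        self.V = lam * np.eye(dim_phi)          # sum of phi phi^T plus ridge term
        self.S = np.zeros((dim_phi, dim_y))     # sum of phi y_{t+H}^T
        self.Theta = np.zeros((dim_phi, dim_y))

    def update(self, phi, y_future):
        # Incorporate one (regressor, realized H-step-ahead output) pair and refit the policy.
        self.V += np.outer(phi, phi)
        self.S += np.outer(phi, y_future)
        self.Theta = np.linalg.solve(self.V, self.S)

    def predict(self, phi):
        # H-step-ahead prediction: a linear function of future inputs, past inputs, and past outputs.
        return self.Theta.T @ phi
```

In this sketch the regret of interest would be the cumulative gap between the squared prediction error of `predict` and that of the Kalman-filter-based optimal predictor built from the true (but here unknown) system matrices; the letter's logarithmic-regret guarantee concerns this gap as the horizon $N$ grows.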