Autoregressive exogenous (ARX) systems are the general class of input-output dynamical systems used to model stochastic linear dynamical systems (LDS), including partially observable LDS such as LQG systems. In this work, we study system identification and adaptive control of unknown ARX systems. We provide finite-time learning guarantees for ARX systems under both open-loop and closed-loop data collection. Using these guarantees, we design adaptive control algorithms for unknown ARX systems with arbitrary strongly convex or convex quadratic regulating costs. Under strongly convex cost functions, we design an adaptive control algorithm that uses online gradient descent to update controllers constructed via a convex controller reparametrization. We show that this algorithm attains $\tilde{\mathcal{O}}(\sqrt{T})$ regret with an explore-and-commit approach and that, if the model estimates are updated in epochs using closed-loop data collection, it attains the optimal regret of $\text{polylog}(T)$ after $T$ time-steps of interaction. For convex quadratic cost functions, we propose an adaptive control algorithm that uses the optimism in the face of uncertainty principle to design the controller. In this setting, we show that the explore-and-commit approach has a regret upper bound of $\tilde{\mathcal{O}}(T^{2/3})$, and that adaptive control with continuous model estimate updates attains $\tilde{\mathcal{O}}(\sqrt{T})$ regret after $T$ time-steps.
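To make the open-loop identification setting concrete, the following is a minimal sketch and not the estimator analyzed in this work. It assumes a scalar ARX system of the form $y_t = \sum_{i=1}^{p} a_i y_{t-i} + \sum_{j=1}^{q} b_j u_{t-j} + w_t$ with Gaussian noise, excited by i.i.d. Gaussian inputs, and it recovers the coefficients by ordinary least squares. The function names `simulate_arx` and `estimate_arx`, the orders `p` and `q`, and all numerical values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_arx(a, b, T, noise_std=0.1, seed=0):
    """Simulate y_t = sum_i a[i] y_{t-1-i} + sum_j b[j] u_{t-1-j} + w_t.

    Assumption: scalar input/output, i.i.d. standard Gaussian open-loop inputs.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    p, q = len(a), len(b)
    m = max(p, q)                       # number of initial samples left at zero
    rng = np.random.default_rng(seed)
    u = rng.normal(size=T + m)          # exploratory inputs
    w = noise_std * rng.normal(size=T + m)
    y = np.zeros(T + m)
    for t in range(m, T + m):
        past_y = y[t - p:t][::-1]       # y_{t-1}, ..., y_{t-p}
        past_u = u[t - q:t][::-1]       # u_{t-1}, ..., u_{t-q}
        y[t] = a @ past_y + b @ past_u + w[t]
    return y, u


def estimate_arx(y, u, p, q):
    """Ordinary least-squares estimate of the ARX coefficients (a, b)."""
    m = max(p, q)
    # Each regressor row stacks the p most recent outputs and q most recent inputs.
    rows = [
        np.concatenate([y[t - p:t][::-1], u[t - q:t][::-1]])
        for t in range(m, len(y))
    ]
    Phi = np.asarray(rows)
    theta, *_ = np.linalg.lstsq(Phi, y[m:], rcond=None)
    return theta[:p], theta[p:]


# Hypothetical stable example system (values chosen only for illustration).
a_true = np.array([0.5, -0.2])
b_true = np.array([1.0, 0.3])
y, u = simulate_arx(a_true, b_true, T=5000)
a_hat, b_hat = estimate_arx(y, u, p=2, q=2)
print("max |a_hat - a|:", np.max(np.abs(a_hat - a_true)))
print("max |b_hat - b|:", np.max(np.abs(b_hat - b_true)))
```

The sketch covers only the explore phase of an explore-and-commit scheme. Under closed-loop data collection the inputs would instead come from the current controller, which this example does not model.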