Robust Learning-Based Control via Bootstrapped Multiplicative Noise

Despite decades of research and recent progress in adaptive control and reinforcement learning, there remains a fundamental lack of understanding in designing controllers that provide robustness to inherent non-asymptotic uncertainties arising from models estimated with finite, noisy data. We propose a robust adaptive control algorithm that explicitly incorporates such non-asymptotic uncertainties into the control design. The algorithm has three components: (1) a least-squares nominal model estimator; (2) a bootstrap resampling method that quantifies non-asymptotic variance of the nominal model estimate; and (3) a non-conventional robust control design method using an optimal linear quadratic regulator (LQR) with multiplicative noise. A key advantage of the proposed approach is that the system identification and robust control design procedures both use stochastic uncertainty representations, so that the actual inherent statistical estimation uncertainty directly aligns with the uncertainty the robust controller is being designed against. We show through numerical experiments that the proposed robust adaptive controller can significantly outperform the certainty equivalent controller on both expected regret and measures of regret risk.
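The three components can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made here for brevity: the bootstrap resamples (x_t, u_t, x_{t+1}) rows with replacement, which may differ from the resampling scheme used in the paper; the second moments of the model perturbations are averaged directly over bootstrap samples; uncertainty in A and B is treated as independent; and the robust gain comes from plain value iteration on the generalized Riccati equation for multiplicative noise. All function names and the example system are hypothetical.

```python
import numpy as np

def least_squares_fit(X, U, Xn):
    """Nominal estimate of (A, B) from rows x_t, u_t, x_{t+1}."""
    Z = np.hstack([X, U])                          # regressors [x_t, u_t]
    Theta_T, *_ = np.linalg.lstsq(Z, Xn, rcond=None)
    n = X.shape[1]
    return Theta_T.T[:, :n], Theta_T.T[:, n:]      # A_hat, B_hat

def bootstrap_fits(X, U, Xn, n_boot, rng):
    """Refit on rows resampled with replacement (pairs bootstrap, an assumption)."""
    T = X.shape[0]
    fits = []
    for _ in range(n_boot):
        idx = rng.integers(0, T, size=T)
        fits.append(least_squares_fit(X[idx], U[idx], Xn[idx]))
    return fits

def multiplicative_noise_lqr(A, B, dAs, dBs, Q, R, iters=500):
    """Value iteration on the generalized Riccati equation.

    dAs and dBs are samples of model perturbations; their averaged second
    moments stand in for the multiplicative-noise terms. Cross terms between
    A and B perturbations are ignored (independence assumption).
    """
    P = Q.copy()
    for _ in range(iters):
        SA = sum(dA.T @ P @ dA for dA in dAs) / len(dAs)
        SB = sum(dB.T @ P @ dB for dB in dBs) / len(dBs)
        G = R + B.T @ P @ B + SB
        K = -np.linalg.solve(G, B.T @ P @ A)       # gain for u = K x
        P = Q + A.T @ P @ A + SA + A.T @ P @ B @ K
        P = 0.5 * (P + P.T)                        # keep P numerically symmetric
    return K, P

# Hypothetical two-state example with a short, noisy trajectory.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
T = 40
X = np.zeros((T + 1, 2))
U = rng.normal(size=(T, 1))
for t in range(T):
    X[t + 1] = A_true @ X[t] + B_true @ U[t] + 0.1 * rng.normal(size=2)

# (1) nominal model, (2) bootstrap spread around it
A_hat, B_hat = least_squares_fit(X[:-1], U, X[1:])
fits = bootstrap_fits(X[:-1], U, X[1:], n_boot=100, rng=rng)
dAs = [A_b - A_hat for A_b, _ in fits]
dBs = [B_b - B_hat for _, B_b in fits]

# (3) robust gain versus the certainty-equivalent gain (zero perturbations)
Q, R = np.eye(2), np.eye(1)
K_ce, P_ce = multiplicative_noise_lqr(
    A_hat, B_hat, [np.zeros((2, 2))], [np.zeros((2, 1))], Q, R)
K_rob, P_rob = multiplicative_noise_lqr(A_hat, B_hat, dAs, dBs, Q, R)
```

With zero perturbations the recursion reduces to the standard discrete-time Riccati iteration, so `K_ce` plays the role of the certainty-equivalent controller. With the bootstrap perturbations included, the extra terms inflate the cost matrix in proportion to the estimated model uncertainty, which is the sense in which the controller is designed against the same uncertainty that the data exhibit.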
