Reinforcement Learning (RL) controllers have generated excitement within the control community. The primary advantage of RL controllers relative to existing methods is their ability to optimize uncertain systems without requiring an explicit description of the process uncertainty. Recent work on engineering applications has focused on the development of safe RL controllers. Previous works have proposed approaches to account for constraint satisfaction through constraint tightening, a technique borrowed from stochastic model predictive control. Here, we extend these approaches to account for plant-model mismatch. Specifically, we propose a data-driven approach that utilizes Gaussian processes for the offline simulation model and uses the associated posterior uncertainty prediction to account for joint chance constraints and plant-model mismatch. The method is benchmarked against nonlinear model predictive control via case studies. The results demonstrate the ability of the methodology to account for process uncertainty, enabling the satisfaction of joint chance constraints even in the presence of plant-model mismatch.
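The core idea described above, tightening a constraint by a back-off proportional to the Gaussian process posterior standard deviation so that a chance constraint holds despite model uncertainty, can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the training data, the constraint function g(x), the back-off factor kappa, and the use of scikit-learn's GaussianProcessRegressor are all assumptions made for the example.

```python
# Minimal sketch: GP-based constraint tightening for a scalar constraint g(x) <= 0
# required to hold with probability at least 1 - delta. Illustrative only.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical offline simulation data: state-action inputs and noisy constraint values.
X_train = np.random.uniform(-2.0, 2.0, size=(50, 2))
y_train = X_train[:, 0] ** 2 - 1.0 + 0.05 * np.random.randn(50)

# Fit the GP surrogate of the constraint function from the simulation model.
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_train, y_train)

# Back-off multiplier under a Gaussian assumption on the posterior.
delta = 0.05                    # allowed violation probability
kappa = norm.ppf(1.0 - delta)

# Candidate state-action pair proposed by the policy: the tightened constraint value
# must be <= 0 for the chance constraint to be considered satisfied.
X_query = np.array([[0.5, 0.1]])
mean, std = gp.predict(X_query, return_std=True)
tightened = mean + kappa * std
print("tightened constraint value:", tightened[0])
```

In this sketch the posterior standard deviation plays the role of the uncertainty estimate used to inflate the constraint, so a larger plant-model mismatch (larger predictive variance) produces a more conservative back-off.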