Modern engineering systems, such as autonomous vehicles, flexible robotics, and intelligent aerospace platforms, require controllers that are robust to uncertainties, adaptive to environmental changes, and safety-aware under real-time constraints. Reinforcement learning (RL) offers powerful data-driven adaptability for systems with nonlinear dynamics that interact with uncertain environments, but it lacks built-in mechanisms for satisfying dynamic constraints during exploration. Model predictive control (MPC) offers structured constraint handling and robustness, but its reliance on accurate models and on computationally demanding online optimization can hinder real-time deployment. This paper proposes an integrated MPC-RL framework that combines the stability and safety guarantees of MPC with the adaptability of RL. During training, MPC defines safe control bounds that guide the RL component and enable constraint-aware policy learning. At deployment, the learned policy operates in real time with a lightweight safety filter, based on Lipschitz continuity, that ensures constraint satisfaction without heavy online optimization. The approach, validated on a nonlinear aeroelastic wing system, demonstrates improved disturbance rejection, reduced actuator effort, and robust performance under turbulence. The architecture generalizes to other domains with structured nonlinearities and bounded disturbances, offering a scalable solution for safe artificial-intelligence-driven control in engineering applications.
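To make the deployment-time mechanism concrete, the following is a minimal Python sketch of a Lipschitz-based safety filter of the kind the abstract describes. The interface (`safe_bounds`, `lipschitz_const`, `x_ref`) and the bound-tightening rule are illustrative assumptions, not the paper's actual implementation: the idea sketched here is that control bounds certified by MPC at a reference state remain valid nearby, provided they are shrunk by a margin proportional to the distance from that state.

```python
import numpy as np

def lipschitz_safety_filter(u_rl, x, x_ref, safe_bounds, lipschitz_const):
    """Project the RL action into MPC-derived bounds, tightened by a
    Lipschitz margin that accounts for the distance between the current
    state x and the state x_ref at which the bounds were certified.
    All names, shapes, and the fallback rule are hypothetical."""
    u_lo, u_hi = safe_bounds(x_ref)                 # bounds certified at x_ref
    margin = lipschitz_const * np.linalg.norm(x - x_ref)
    u_lo_safe = u_lo + margin                       # tighten both bounds so the
    u_hi_safe = u_hi - margin                       # clipped action stays safe at x
    if np.any(u_lo_safe > u_hi_safe):               # margin exhausted the safe set:
        return 0.5 * (u_lo + u_hi)                  # fall back to the bound midpoint
    return np.clip(u_rl, u_lo_safe, u_hi_safe)      # cheap projection, no online MPC

# Hypothetical usage with constant certified bounds around the origin.
if __name__ == "__main__":
    bounds = lambda x_ref: (np.array([-1.0]), np.array([1.0]))
    u = lipschitz_safety_filter(
        u_rl=np.array([0.95]),       # raw action from the learned policy
        x=np.array([0.1, -0.05]),    # current plant state
        x_ref=np.zeros(2),           # state where bounds were certified
        safe_bounds=bounds,
        lipschitz_const=0.5,
    )
    print(u)  # action projected into the tightened safe interval
```

The key design point the abstract emphasizes is cost: the filter reduces to a norm computation and an elementwise clip, so it runs in constant time at each step, in contrast to solving an MPC optimization online.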