Optimal Control of Stochastic Markov Jump Systems Based on Policy Iteration Algorithms

Project title: Optimal Control of Stochastic Markov Jump Systems Based on Policy Iteration Algorithms

Project number: No.61203051

Project type: Young Scientists Fund project

Year of approval: 2013

Discipline: Automation

Project author: He Shuping (何舒平)

Institution: Anhui University (安徽大学)

Funding: 250,000 CNY

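Abstract (translated from Chinese): Markov jump systems are an important class of stochastic dynamic systems. Most existing optimal control methods for them rely on parameter optimization and performance-index optimization built on Lyapunov functions, and are essentially off-line. Control algorithms based on policy iteration, proposed in recent years, can be applied to on-line optimal control; unlike off-line methods, they do not require the system model to be completely known. This project proposes to establish an infinite-horizon integral cost function equation that describes the relationship between the states and the control inputs, and to use policy evaluation and policy improvement steps to realize on-line policy iteration optimal control of stochastic Markov jump systems. For linear jump systems, state feedback control is combined with on-line solution of the algebraic Riccati equation to obtain the iterated feedback controller and the on-line optimal control policy. For nonlinear jump systems, the work builds on T-S fuzzy control and linear differential inclusion methods, combined with the Actor-Critic learning control algorithm and the least-squares algorithm, to design the on-line policy iteration optimal controller. Through this research, the project aims to propose new on-line optimal control algorithms suited to stochastic Markov jump systems.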

Keywords (translated from Chinese): policy iteration; Markov jump systems; optimal control; feedback control

English abstract: Markov jump systems are an important class of stochastic dynamic systems. Existing optimal control schemes for them are mostly confined to parameter optimization and performance-index optimization; these approaches are based on Lyapunov functions and are essentially off-line. The policy iteration algorithm, proposed in recent years, can be used for on-line optimal control. Compared with existing off-line methods, it does not require complete knowledge of the system's internal dynamics. By constructing an infinite-horizon integral cost function that describes the relationship between the states and the control inputs, and by applying policy evaluation and policy improvement steps, this project aims to develop on-line policy iteration optimal control of stochastic Markov jump systems. For the linear case, the iterated feedback controller and the optimal control scheme are obtained by solving the associated algebraic Riccati equation on-line. For the nonlinear case, T-S fuzzy models and linear differential inclusion representations are first established to approximate the system model, and the policy iteration optimal controller is then designed using Actor-Critic learning control and least-squares methods. Through this research, the project aims to propose a new on-line optimal control algorithm for stochastic Markov jump systems.

English keywords: Policy iteration; Markov jump systems; Optimal control; Feedback control
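
Illustrative sketch (not part of the project record): the policy evaluation and policy improvement steps that the abstract describes for the linear case can be sketched as a Kleinman-type iteration on the coupled algebraic Riccati equations of a continuous-time Markov jump linear system. Everything here is an assumption made for illustration: the two-mode matrices A, B, Q, R, the transition-rate matrix G, and all function names are hypothetical. The sketch is the model-based, off-line form of the iteration, so it uses A directly; the on-line algorithm targeted by the project is meant to avoid requiring full knowledge of the internal dynamics and is not reproduced here.

```python
import numpy as np

def evaluate_policy(A, B, Q, R, G, K):
    """Policy evaluation: solve the coupled Lyapunov equations
    Ac_i' P_i + P_i Ac_i + sum_j G[i, j] P_j = -(Q_i + K_i' R_i K_i),
    with Ac_i = A_i - B_i K_i, as one linear system in vec(P_1..P_N)."""
    N, n = len(A), A[0].shape[0]
    m = n * n
    I = np.eye(n)
    M = np.zeros((N * m, N * m))
    rhs = np.zeros(N * m)
    for i in range(N):
        Ac = A[i] - B[i] @ K[i]
        rows = slice(i * m, (i + 1) * m)
        # vec(Ac' P + P Ac) = (Ac' kron I + I kron Ac') vec(P)
        M[rows, rows] += np.kron(Ac.T, I) + np.kron(I, Ac.T)
        for j in range(N):
            # Mode-coupling term G[i, j] * P_j
            M[rows, j * m:(j + 1) * m] += G[i, j] * np.eye(m)
        rhs[rows] = -(Q[i] + K[i].T @ R[i] @ K[i]).reshape(-1)
    p = np.linalg.solve(M, rhs)
    P = [p[i * m:(i + 1) * m].reshape(n, n) for i in range(N)]
    return [(Pi + Pi.T) / 2 for Pi in P]  # symmetrize against round-off

def improve_policy(B, R, P):
    """Policy improvement: K_i = R_i^{-1} B_i' P_i."""
    return [np.linalg.solve(R[i], B[i].T @ P[i]) for i in range(len(B))]

def riccati_residual(A, B, Q, R, G, P):
    """Residual of the coupled algebraic Riccati equations, one per mode."""
    res = []
    for i in range(len(A)):
        coupling = sum(G[i, j] * P[j] for j in range(len(A)))
        gain_term = P[i] @ B[i] @ np.linalg.solve(R[i], B[i].T @ P[i])
        res.append(A[i].T @ P[i] + P[i] @ A[i] - gain_term + Q[i] + coupling)
    return res

def policy_iteration(A, B, Q, R, G, K0, max_iter=50, tol=1e-12):
    """Alternate evaluation and improvement until the gains stop changing.
    K0 is assumed to be stochastically stabilizing."""
    K = K0
    for _ in range(max_iter):
        P = evaluate_policy(A, B, Q, R, G, K)
        K_new = improve_policy(B, R, P)
        change = max(np.max(np.abs(Kn - Ko)) for Kn, Ko in zip(K_new, K))
        K = K_new
        if change < tol:
            break
    return P, K

# Hypothetical two-mode example. Each A_i + A_i' is negative definite, so the
# open loop is stable under any switching and K0 = 0 is a valid starting policy.
A = [np.array([[-1.0, 0.5], [0.0, -2.0]]), np.array([[-0.5, 0.0], [0.3, -1.0]])]
B = [np.array([[0.0], [1.0]]), np.array([[1.0], [0.5]])]
Q = [np.eye(2), np.eye(2)]
R = [np.eye(1), np.eye(1)]
G = np.array([[-0.6, 0.6], [0.4, -0.4]])  # transition rates, rows sum to zero
K0 = [np.zeros((1, 2)), np.zeros((1, 2))]

P, K = policy_iteration(A, B, Q, R, G, K0)
residuals = riccati_residual(A, B, Q, R, G, P)
print("max Riccati residual:", max(np.max(np.abs(r)) for r in residuals))
```

Stacking all modes into a single linear solve keeps the evaluation step exact for small examples; for larger state dimensions a dedicated Lyapunov solver per mode, iterated over the coupling terms, would likely scale better.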
