Research on Autonomous Learning Methods for Mobile Robots Based on Inverse Reinforcement Learning and Artificial Intelligence

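Project title: Research on Autonomous Learning Methods for Mobile Robots Based on Inverse Reinforcement Learning and Artificial Intelligence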

Project number: No.61305121

Project type: Young Scientists Fund project

Year of approval: 2014

Discipline: Automation technology, computer technology

Project author: Li Decai (李德才)

Author affiliation: Shenyang Institute of Automation, Chinese Academy of Sciences (中国科学院沈阳自动化研究所)

Funding: 230,000 yuan

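Abstract (translated from Chinese): Autonomous motion of mobile robots in complex environments is often very difficult. Raising the intelligence level of robots and strengthening their capacity for autonomous behavior in uncertain environments therefore has considerable theoretical value and practical significance. This project addresses the low learning accuracy and poor computational efficiency that inverse reinforcement learning can suffer when demonstration policies are limited and uncertain, and studies the following: 1) Build the reward function model with intelligent methods such as echo state networks and extreme learning machines. On that basis, construct a suitable penalty function according to the model structure, combine state feature selection with the modeling algorithm, and propose a new representation of the reward function. 2) To handle uncertainty in the demonstration policies, introduce a likelihood function that is robust to noise and outliers, so that interference signals in the demonstration trajectories can be identified and suppressed; further, approximate the reward function of the ideal case from non-optimal demonstration trajectories. 3) Building on 1) and 2), develop an inverse reinforcement learning method for multiple agents, to overcome the limited working capacity of a single robot. By combining inverse reinforcement learning with artificial intelligence, the project offers new ideas and methods for the autonomous learning of mobile robots in complex environments.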

Keywords (translated from Chinese): inverse reinforcement learning; artificial intelligence; mobile robot; autonomous learning

English abstract: Realizing autonomous behavior of mobile robots in complex environments is a difficult task. Hence, enhancing the intelligence level of robots, as well as their capability for autonomous behavior in uncertain environments, has significant theoretical value and practical meaning. Regarding finite and uncertain demonstrations in inverse reinforcement learning and the problems they might lead to, the project focuses on the following aspects: 1) Applying machine learning methods, such as echo state networks and extreme learning machines, to establish the reward function model. Then, combining state feature selection with the modeling method by constructing an appropriate penalty function, and proposing a novel approach for reward function representation. 2) For the uncertainty of the demonstrations, the influence of interference signals in the demonstration trajectories can be restricted by employing an appropriate likelihood function that is robust to noise and outliers. In this case, the reward function of ideal demonstration trajectories can be approximated by using suboptimal ones. 3) Based on research contents 1) and 2), a multi-agent inverse reinforcement learning method will be investigated in order to overcome the limited operational capability of a single robot. This project combines the inverse reinforcement learnin

English keywords: inverse reinforcement learning; artificial intelligence; mobile robot; autonomous learning
