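Project title: Theory and Methods of Large-Scale Reinforcement Learning Based on Fuzzy Logic
Project number: No. 61472262
Project type: General Program
Year of approval: 2015
Discipline: Automation technology, computer technology
Project author: Liu Quan (刘全)
Author affiliation: Soochow University (苏州大学)
Project funding: 820,000 yuan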
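Chinese abstract (translated): This project addresses the curse of dimensionality that arises in large-scale reinforcement learning problems and proposes reinforcement learning methods based on type-1 and type-2 fuzzy logic. The main idea is to combine reinforcement learning with type-1 fuzzy logic, type-2 fuzzy logic, and neural networks to build neuro-fuzzy reinforcement learning models suitable for large-scale problems: (1) a two-layer fuzzy inference system or a neuron-based fuzzy inference system is used to represent features of the state space, which effectively reduces the state dimensionality and speeds up the convergence of reinforcement learning algorithms; (2) a type-2 fuzzy reinforcement learning model based on type-2 fuzzy inference is built to further improve the algorithm's ability to handle uncertainty and its robustness to noise; (3) the cross-entropy optimization method is used to optimize the membership function parameters of the fuzzy reinforcement learning model, improving the accuracy of the Q-value function; (4) the resulting fuzzy reinforcement learning systems are applied to large-scale Deep Web information search, to address slow convergence, or even failure to converge, caused by the high dimensionality of the state space and the uncertainty of semantic information.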
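Chinese keywords (translated): reinforcement learning; fuzzy logic; neural network; function approximation; basis function optimization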
English abstract: To address the curse of dimensionality that arises in large-scale reinforcement learning problems, this project proposes several reinforcement learning methods based on type-1 and type-2 fuzzy logic. The main idea is to construct neuro-fuzzy reinforcement learning models for large-scale reinforcement learning problems by combining type-1 fuzzy inference, type-2 fuzzy inference, and neuro-fuzzy inference with reinforcement learning methods. First, a two-layer fuzzy inference system, or a fuzzy inference system based on neural units, is used to represent the features of the state space, which efficiently reduces the dimension of the state space and speeds up convergence. Second, a type-2 fuzzy reinforcement learning model based on type-2 fuzzy inference is constructed, which improves the handling of uncertainty and the robustness to noise. Third, the cross-entropy optimization method is used to optimize the parameters of the membership functions in order to improve the accuracy of the Q-value functions. Finally, the project plans to apply the three proposed methods to algorithms used for Deep Web search, to help solve the problems of slow convergence or non-convergence caused by the high dimension of the state space and the uncertainty of semantic information.
English keywords: reinforcement learning; fuzzy logic; neural network; function approximation; basis function optimization
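
Illustrative note: the abstracts describe using fuzzy inference to represent state features for a Q-value function. The sketch below is not the project's method; it is a minimal type-1 example under stated assumptions: Gaussian membership functions on a shared grid of centers, product rule firing, and a linear Q-function trained with a standard Q-learning update. All function names and parameter values (`fuzzy_features`, `width`, `alpha`, `gamma`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_memberships(x, centers, width):
    """Membership degree of scalar x in each Gaussian fuzzy set (assumed shape)."""
    return np.exp(-0.5 * ((x - centers) / width) ** 2)


def fuzzy_features(state, centers, width):
    """Normalized rule firing strengths for a state vector.

    One rule per combination of fuzzy sets across the state dimensions,
    combined with the product t-norm, so the feature vector has
    len(centers) ** len(state) entries.
    """
    strengths = np.ones(1)
    for x in state:
        mu = gaussian_memberships(x, centers, width)
        strengths = np.outer(strengths, mu).ravel()
    return strengths / strengths.sum()


def q_values(weights, phi):
    """Q(s, a) for every action; weights has shape (n_actions, n_rules)."""
    return weights @ phi


def q_learning_update(weights, phi, action, reward, phi_next,
                      alpha=0.1, gamma=0.95):
    """One Q-learning step on the rule consequents; updates weights in place."""
    td_target = reward + gamma * np.max(q_values(weights, phi_next))
    td_error = td_target - q_values(weights, phi)[action]
    weights[action] += alpha * td_error * phi
    return td_error
```

In this naive grid construction the number of rules grows exponentially with the number of state dimensions, which is the kind of growth the abstracts refer to as the curse of dimensionality. Tuning `centers` and `width`, for example with the cross-entropy method the abstracts mention, is left out of this sketch.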