Research on Self-Organizing Queueing Networks with State-Adaptive Node Operation Modes

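Project name: Research on Self-Organizing Queueing Networks with State-Adaptive Node Operation Modes

Project number: No. 71201026

Project type: Young Scientists Fund project

Year of approval: 2013

Discipline: Management Science and Engineering

Project author: Zhang Zhicong (张智聪)

Affiliation: Dongguan University of Technology (东莞理工学院)

Funding: 190,000 yuan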

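Chinese abstract (translated): With the rapid development and wide application of network technologies such as the Internet of Things, the optimization of self-organizing queueing networks, a new class of queueing network problems, is of growing academic and practical value. This project studies the control of queueing networks in which the nodes (servers) have multiple operation modes and both the customer routing paths and the network structure are self-organizing. The problem is abstracted as a new type of multi-objective nested semi-Markov decision process. A control decision model is then built on a coupled reinforcement learning architecture and solved with an adaptive-step-size reinforcement learning algorithm that incorporates a support vector machine function approximator. The result is a control policy that integrates adaptive adjustment of node operation modes, path selection, and customer dispatch sequencing. The main contributions are the concept of the nested semi-Markov decision process, a mechanism for adaptively adjusting the learning step size of reinforcement learning algorithms, and an overall optimization scheme for a class of self-organizing queueing networks that simultaneously optimizes multiple objectives such as weighted mean flow time and network operating cost. The project is expected to enrich the theory, methods, and applied research of self-organizing queueing network control.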
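As a rough illustration of the kind of update such an approach builds on, the sketch below shows one tabular Q-learning step for a semi-Markov decision process. It is not the project's algorithm: the abstract does not specify the nested structure, the coupled architecture, the SVM approximator, or the step-size rule, so a plain lookup table and a visit-count-based step size are substituted here as stated assumptions. The function name `smdp_q_update`, the state encoding (queue length), the action names (`"slow"`, `"fast"`), and the discount rate `beta` are all hypothetical.

```python
import math
from collections import defaultdict


def smdp_q_update(q, visits, state, action, reward, sojourn,
                  next_state, actions, beta=0.1):
    """Apply one SMDP Q-learning update and return the new Q-value.

    Assumptions (not taken from the abstract):
      - Q-values live in a plain table, not an SVM approximator.
      - The step size is 1 / (visit count), a simple stand-in for an
        adaptive step-size mechanism.
      - The reward is a lump sum received at the decision epoch.
    """
    visits[(state, action)] += 1
    alpha = 1.0 / visits[(state, action)]      # shrinks as the pair is revisited
    discount = math.exp(-beta * sojourn)       # discount over the random sojourn time
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + discount * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: state = queue length, action = node operation mode.
ACTIONS = ("slow", "fast")
q_table = defaultdict(float)
visit_counts = defaultdict(int)
q_table[(1, "slow")] = 2.0
first = smdp_q_update(q_table, visit_counts, 2, "fast", -3.0, 0.0, 1, ACTIONS)
second = smdp_q_update(q_table, visit_counts, 2, "fast", -5.0, 0.0, 1, ACTIONS)
```

The only difference from ordinary Q-learning is the discount factor, which depends on the time spent between decision epochs rather than being a constant per step.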

Chinese keywords (translated): nested Markov process; reinforcement learning; queueing network; self-organization

English abstract: Network technologies such as the Internet of Things have been rapidly developed and extensively applied. The optimization of self-organized queueing networks is a new type of queueing network problem whose academic and practical value is increasingly recognized. This project studies a queueing network control problem characterized by nodes (servers) with multiple operation modes, self-organized transportation paths, and a self-organized network architecture. We formulate the problem as a multi-objective Nested Semi-Markov Decision Process. We then build the control model on a Reinforcement Learning architecture and solve the problem with a step-size self-adaptive Reinforcement Learning algorithm combined with a function approximator based on the support vector machine (SVM). We obtain a control policy that integrates operation mode adjustment for the nodes, path selection, and customer sequencing. The main value of our study lies in proposing the concept of the Nested Semi-Markov Decision Process and the step-size self-adaptive Reinforcement Learning algorithm. We also provide multi-objective global optimization solutions for a new type of self-organized queueing network, e.g. shortening the weighted mean flow time and reducing the cost of operation simultaneously. We aim at enriching the theory and application stu

English keywords: Nested Markov Decision Process; Reinforcement Learning; Queueing Networks; Self-organization
