Reinforcement learning and symbolic planning have both been used to build intelligent autonomous agents. Reinforcement learning relies on learning from interactions with the real world, which often requires an infeasibly large amount of experience. Symbolic planning relies on manually crafted symbolic knowledge, which may not be robust to domain uncertainty and change. In this paper, we present a unified framework, {\em PEORL}, that integrates symbolic planning with hierarchical reinforcement learning (HRL) to cope with decision-making in dynamic environments with uncertainties. Symbolic plans are used to guide the agent's task execution and learning, and the learned experience is fed back into the symbolic knowledge to improve planning. This method leads to rapid policy search and robust symbolic plans in complex domains. The framework is evaluated on benchmark HRL domains.
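As a rough illustration of the planning-learning loop the abstract describes, the following minimal Python sketch shows symbolic plans guiding execution while learned experience is fed back to bias future planning. The \texttt{Planner} and \texttt{execute\_and\_learn} interfaces, the actions, and the reward values are hypothetical placeholders for illustration only, not the paper's actual formulation.
\begin{verbatim}
# Minimal sketch of a plan-guided learning loop, assuming a hypothetical
# Planner that proposes action sequences and a stand-in for learning over
# subtasks. Illustrates the plan -> execute/learn -> feed back cycle only.
import random

class Planner:
    """Toy symbolic planner whose plans are biased by learned values."""
    def __init__(self, actions):
        self.actions = actions
        self.value = {a: 0.0 for a in actions}  # quality fed back by the agent

    def plan(self, length=3):
        # Greedily prefer actions with higher learned value, plus noise.
        ranked = sorted(self.actions,
                        key=lambda a: self.value[a] + random.random(),
                        reverse=True)
        return ranked[:length]

    def update(self, action, reward):
        # Feedback step: learned experience improves future planning.
        self.value[action] += 0.1 * (reward - self.value[action])

def execute_and_learn(action):
    """Stand-in for hierarchical RL over one subtask; stochastic reward."""
    base = {"move": 1.0, "pick": 0.5, "drop": -0.2}
    return base[action] + random.gauss(0, 0.1)

planner = Planner(["move", "pick", "drop"])
for episode in range(20):
    for action in planner.plan():
        reward = execute_and_learn(action)  # plan guides execution/learning
        planner.update(action, reward)      # experience fed back to planner
\end{verbatim}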