Recent progress in applying Reinforcement Learning to Resource Management has produced MDP formulations without a deeper analysis of how design decisions impact agent performance. In this paper, we compare and contrast four MDP variations, discussing their computational requirements and their impact on agent performance through an empirical analysis. We conclude by showing that, in our experiments, when Multi-Layer Perceptrons are used as the approximation function, a compact state representation allows agents to be transferred between environments, and that transferred agents perform well, outperforming specialized agents in 80\% of the tested scenarios even without retraining.