Experience-Driven PCG via Reinforcement Learning: A Super Mario Bros Study

We introduce a procedural content generation (PCG) framework at the intersection of experience-driven PCG and PCG via reinforcement learning, named ED(PCG)RL, or EDRL for short. EDRL is able to teach RL designers to generate endless playable levels in an online manner while respecting particular player experiences, which are specified in the form of reward functions. The framework is tested initially on Super Mario Bros. In particular, the RL designers of Super Mario Bros generate and concatenate level segments while considering the diversity among the segments. The correctness of the generation is ensured by a neural net-assisted evolutionary level repairer, and the playability of the whole level is determined through AI-based testing. Our agents in this EDRL implementation learn to maximise a quantification of Koster's principle of fun by moderating the degree of diversity across level segments. Moreover, we test their ability to design fun levels that are diverse over time and playable. Our proposed framework is capable of generating endless, playable Super Mario Bros levels with varying degrees of fun, deviation from earlier segments, and playability. EDRL can be generalised to any game that is built as a segment-based sequential process and features a built-in compressed representation of its game content.
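The segment-by-segment loop described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the names `generate_level`, `moderated_diversity_reward`, and the `toy_*` helpers are hypothetical, a random policy stands in for the trained RL designer, the decoder, repairer, and playability test are trivial stubs, and the reward is a simple band around a moderate tile difference rather than the paper's actual quantification of Koster's principle.

```python
import random
from typing import Callable, List, Tuple

Segment = List[List[str]]  # tile grid: rows of single-character tiles


def tile_difference(a: Segment, b: Segment) -> float:
    """Fraction of tile positions at which two equally sized segments differ."""
    cells = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x != y for x, y in cells) / len(cells)


def moderated_diversity_reward(new: Segment, history: List[Segment],
                               low: float = 0.2, high: float = 0.6) -> float:
    """Illustrative reward: 1.0 when the mean difference to recent segments
    lies in [low, high], reduced by the distance to the nearest bound otherwise."""
    if not history:
        return 0.0
    d = sum(tile_difference(new, h) for h in history) / len(history)
    if d < low:
        return 1.0 - (low - d)
    if d > high:
        return 1.0 - (d - high)
    return 1.0


def generate_level(policy: Callable[[List[Segment]], List[float]],
                   decode: Callable[[List[float]], Segment],
                   repair: Callable[[Segment], Segment],
                   is_playable: Callable[[List[Segment]], bool],
                   attempts: int, window: int = 3
                   ) -> Tuple[List[Segment], List[float]]:
    """Online loop: pick a latent vector, decode, repair, playtest, append."""
    level: List[Segment] = []
    rewards: List[float] = []
    for _ in range(attempts):
        recent = level[-window:]
        segment = repair(decode(policy(recent)))
        if not is_playable(level + [segment]):
            continue  # assumption: unplayable candidates are simply skipped
        rewards.append(moderated_diversity_reward(segment, recent))
        level.append(segment)
    return level, rewards


# Toy stand-ins (all hypothetical) so the sketch runs end to end.
def random_policy(history: List[Segment]) -> List[float]:
    return [random.uniform(-1.0, 1.0) for _ in range(4)]


def toy_decode(latent: List[float]) -> Segment:
    # One sky row and one ground row; a negative latent value becomes a gap.
    return [["-"] * len(latent), ["X" if v >= 0 else "-" for v in latent]]


def toy_repair(segment: Segment) -> Segment:
    fixed = [row[:] for row in segment]
    fixed[-1][0] = "X"  # force solid ground where segments join
    return fixed


def toy_is_playable(level: List[Segment], max_gap: int = 2) -> bool:
    ground = "".join("".join(seg[-1]) for seg in level)
    return "-" * (max_gap + 1) not in ground


if __name__ == "__main__":
    random.seed(0)
    level, rewards = generate_level(random_policy, toy_decode, toy_repair,
                                    toy_is_playable, attempts=8)
    for seg, r in zip(level, rewards):
        print("".join(seg[-1]), round(r, 2))
```

In a real setting the policy would be trained on these rewards, and the stubs would be replaced by a learned generator over the compressed representation, the level repairer, and an AI playtesting agent.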
