While several high-profile video games have served as testbeds for Deep Reinforcement Learning (DRL), the technique has rarely been employed by the game industry to craft authentic AI behaviors. Previous research has focused on training super-human agents with large models, which is impractical for game studios with limited resources aiming for human-like agents. This paper proposes a sample-efficient DRL method tailored for training and fine-tuning agents in industrial settings such as the video game industry. Our method improves the sample efficiency of value-based DRL by leveraging pre-collected data and increasing network plasticity. We evaluate our method by training a goalkeeper agent in EA SPORTS FC 25, one of the best-selling football simulations today. Our agent outperforms the game's built-in AI by 10% in ball-saving rate. Ablation studies show that our method trains agents 50% faster than standard DRL methods. Finally, qualitative evaluation by domain experts indicates that our approach produces more human-like gameplay than hand-crafted agents. As a testament to its impact, the method has been adopted in the most recent release of the series.
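The abstract names two ingredients for sample efficiency: seeding value-based DRL with pre-collected data, and restoring network plasticity during training. The sketch below is a hypothetical, minimal illustration of both ideas, not the paper's actual implementation: a replay buffer is pre-filled with placeholder transitions (in practice these might come from the game's built-in AI), and the Q-network's output layer can be re-initialised as a simple plasticity intervention. All class and function names here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

class ReplayBuffer:
    """Fixed-capacity ring buffer of (s, a, r, s') transitions."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.pos = 0

    def add(self, transition):
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
            self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        idx = rng.integers(len(self.data), size=batch_size)
        return [self.data[i] for i in idx]

class TinyQNet:
    """Toy linear Q-network: Q(s) = s @ W + b."""
    def __init__(self, obs_dim, n_actions):
        self.W = rng.normal(0.0, 0.1, (obs_dim, n_actions))
        self.b = np.zeros(n_actions)

    def q_values(self, s):
        return s @ self.W + self.b

    def reset_head(self):
        # Plasticity intervention (assumed mechanism): re-initialise the
        # output layer so the network can keep adapting late in training.
        self.W = rng.normal(0.0, 0.1, self.W.shape)
        self.b = np.zeros_like(self.b)

# Seed the buffer with pre-collected transitions (random placeholders here;
# a real setup would log them from existing game AI rollouts).
buffer = ReplayBuffer(capacity=10_000)
for _ in range(500):
    s = rng.normal(size=4)
    a = int(rng.integers(3))
    r = float(rng.normal())
    s_next = rng.normal(size=4)
    buffer.add((s, a, r, s_next))

net = TinyQNet(obs_dim=4, n_actions=3)
batch = buffer.sample(32)   # training starts from non-empty experience
```

The point of the sketch is the training-loop shape: learning starts against a non-empty buffer rather than from scratch, and the head reset gives the value network fresh capacity to adapt, which is one common way "plasticity" interventions are realised in the DRL literature.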