In this work we apply deep reinforcement learning to two problems: navigating a three-dimensional environment and inferring the locations of human speaker audio sources within it, in the case where the only available information is the raw sound from the environment, as a simulated human listener placed in the environment would hear it. For this purpose we create two virtual environments using the Unity game engine, one presenting an audio-based navigation problem and one presenting an audio source localization problem. We also create an autonomous agent based on the PPO (Proximal Policy Optimization) online reinforcement learning algorithm and attempt to train it to solve these environments. Our experiments show that our agent achieves adequate performance and generalization ability in both environments, as measured by quantitative metrics, even when only a limited amount of training data is available or the environment parameters shift in ways not encountered during training. We also show that a degree of agent knowledge transfer is possible between the environments.