State representation learning, or the ability to capture latent generative factors of an environment, is crucial for building intelligent agents that can perform a wide variety of tasks. Learning such representations without supervision from rewards is a challenging open problem. We introduce a method that learns state representations by maximizing mutual information across spatially and temporally distinct features of a neural encoder of the observations. We also introduce a new benchmark based on Atari 2600 games, on which we evaluate representations by how well they capture the ground-truth state variables. We believe this new framework for evaluating representation learning models will be crucial for future representation learning research. Finally, we compare our technique with other state-of-the-art generative and contrastive representation learning methods. The code associated with this work is available at https://github.com/mila-iqia/atari-representation-learning.
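To make the core idea concrete, the following is a minimal sketch (not the authors' released implementation) of an InfoNCE-style contrastive objective that maximizes mutual information between encoder features of temporally adjacent frames, with other batch entries serving as negatives; the class name SpatioTemporalLoss and the feature dimensions are hypothetical.

```python
# Minimal sketch, assuming PyTorch and paired features from frames t and t+1.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatioTemporalLoss(nn.Module):
    """Contrasts features of frame t against frame t+1 across a batch."""
    def __init__(self, feature_dim: int):
        super().__init__()
        # Bilinear compatibility score between the two feature sets.
        self.W = nn.Linear(feature_dim, feature_dim, bias=False)

    def forward(self, feats_t: torch.Tensor, feats_tp1: torch.Tensor) -> torch.Tensor:
        # feats_t, feats_tp1: (batch, feature_dim), one temporal pair per row.
        # Positive pairs lie on the diagonal of the score matrix; all other
        # entries in a row act as negatives drawn from the same batch.
        scores = self.W(feats_t) @ feats_tp1.t()            # (batch, batch)
        targets = torch.arange(scores.size(0), device=scores.device)
        return F.cross_entropy(scores, targets)             # InfoNCE loss

# Usage: the features could be global encoder outputs or flattened local
# patches from an intermediate convolutional layer, which is how the
# "spatially distinct" part of the objective would enter.
loss_fn = SpatioTemporalLoss(feature_dim=256)
f_t, f_tp1 = torch.randn(32, 256), torch.randn(32, 256)
loss = loss_fn(f_t, f_tp1)
loss.backward()
```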