The capacity for rapid domain adaptation is important for increasing the applicability of reinforcement learning (RL) to real-world problems. Generalization of RL agents is critical to success in the real world, yet zero-shot policy transfer is challenging because even minor visual changes can cause a trained agent to fail completely in the new task. We propose USRA: Unified State Representation Learning under Data Augmentation, a representation learning framework that learns a latent unified state representation by applying data augmentations to the agent's observations, improving its ability to generalize to unseen target domains. We demonstrate our approach on the DeepMind Control Generalization Benchmark for the Walker environment and find that USRA achieves higher sample efficiency and 14.3% better domain adaptation performance than the best baseline results.
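The idea of a state representation that stays unified across augmented observations can be illustrated with a small sketch. The abstract does not specify the augmentations, encoder, or loss used by USRA, so everything below is an assumption for illustration only: `random_shift` is a generic pad-and-crop augmentation, `encode` is a hypothetical linear encoder, and `consistency_loss` is a plain mean squared distance between the latents of two augmented views.

```python
import random

def random_shift(image, pad=2, rng=random):
    """Pad a 2D image (list of rows) by edge replication, then crop
    back to the original size at a random offset.
    Assumed augmentation; not taken from the USRA paper."""
    h, w = len(image), len(image[0])
    padded = []
    for r in range(-pad, h + pad):
        # Clamp indices so border pixels are repeated into the padding.
        row = image[min(max(r, 0), h - 1)]
        padded.append([row[min(max(c, 0), w - 1)] for c in range(-pad, w + pad)])
    top = rng.randint(0, 2 * pad)
    left = rng.randint(0, 2 * pad)
    return [row[left:left + w] for row in padded[top:top + h]]

def encode(image, weights):
    """Hypothetical linear encoder: flatten the image and project it
    to a latent vector with one weight row per latent dimension."""
    flat = [p for row in image for p in row]
    return [sum(wi * p for wi, p in zip(w_row, flat)) for w_row in weights]

def consistency_loss(z_a, z_b):
    """Mean squared distance between two latent vectors. Minimizing it
    pushes augmented views of one observation toward the same latent."""
    return sum((a - b) ** 2 for a, b in zip(z_a, z_b)) / len(z_a)

# Usage: two augmented views of the same observation, one shared encoder.
rng = random.Random(0)
obs = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
weights = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(3)]
view_a = random_shift(obs, pad=1, rng=rng)
view_b = random_shift(obs, pad=1, rng=rng)
loss = consistency_loss(encode(view_a, weights), encode(view_b, weights))
```

In a real agent the encoder would be a trained network and this loss would be one term optimized alongside the RL objective; the sketch only shows how two augmented views of one observation are compared in latent space.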