Recent deep reinforcement learning (DRL) successes rely on end-to-end learning from fixed-size observational inputs (e.g., images, state variables). However, many challenging and interesting decision-making problems involve observations or intermediary representations that are best described as a set of entities: either an image-based approach would miss small but important details in the observations (e.g., objects on a radar, vehicles in satellite images), the number of sensed objects is not fixed (e.g., robotic manipulation), or the problem simply cannot be represented in a meaningful way as an image (e.g., power grid control, or logistics). This type of structured representation is not directly compatible with current DRL architectures; however, machine learning techniques that directly target structured information have been gaining traction and could address this issue. We propose to combine recent advances in set representations with slot attention and graph neural networks to process structured data, broadening the range of applications of DRL algorithms. This approach makes it possible to address entity-based problems in an efficient and scalable way. We show that it can significantly improve training time and robustness, and demonstrate its potential to handle structured as well as purely visual domains, on multiple environments from the Atari Learning Environment and Simple Playgrounds.
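To make the entity-based pipeline concrete, the sketch below shows a minimal slot-attention module (following Locatello et al., 2020) that maps a variable-size set of entity features to a fixed number of slots, which a downstream graph network or policy head could then consume. This is an illustrative sketch, not the paper's implementation: the class name, dimensions, and hyperparameters are assumptions chosen for brevity.

```python
import torch
import torch.nn as nn


class SlotAttention(nn.Module):
    """Minimal slot-attention sketch: variable-size entity set -> fixed set of slots."""

    def __init__(self, num_slots, dim, iters=3, eps=1e-8):
        super().__init__()
        self.num_slots, self.iters, self.eps = num_slots, iters, eps
        self.scale = dim ** -0.5
        # Learned Gaussian from which initial slots are sampled.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_logsigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.gru = nn.GRUCell(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.norm_inputs = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)
        self.norm_mlp = nn.LayerNorm(dim)

    def forward(self, inputs):
        # inputs: (batch, n_entities, dim); n_entities may vary between episodes.
        b, n, d = inputs.shape
        inputs = self.norm_inputs(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_logsigma.exp() * torch.randn(
            b, self.num_slots, d, device=inputs.device)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            # Softmax over the slot axis so slots compete for entities.
            attn = torch.softmax(torch.einsum('bnd,bkd->bnk', k, q) * self.scale, dim=-1)
            # Normalize over entities to get a weighted mean per slot.
            attn = attn / (attn.sum(dim=1, keepdim=True) + self.eps)
            updates = torch.einsum('bnk,bnd->bkd', attn, v)
            slots = self.gru(updates.reshape(-1, d), slots.reshape(-1, d)).reshape(b, self.num_slots, d)
            slots = slots + self.mlp(self.norm_mlp(slots))
        return slots  # (batch, num_slots, dim): fixed-size summary for the policy head
```

In an entity-based DRL agent of this kind, the resulting slots can be passed through a relational module (e.g., a graph neural network) before the value and policy heads, so the architecture stays invariant to the number and ordering of observed entities.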