We tackle real-world problems with complex structure, beyond pixel-based games and simulators. We formulate this as a few-shot reinforcement learning problem in which each task is characterized by a subtask graph that defines a set of subtasks and their dependencies, which are unknown to the agent. Unlike previous meta-RL methods, which try to directly infer an unstructured task embedding, our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure, in the form of a subtask graph, from the training tasks and then uses it as a prior to improve task inference at test time. Our experimental results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks to adapt to unseen tasks faster than various existing algorithms, including meta reinforcement learning, hierarchical reinforcement learning, and other heuristic agents.
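To make the notion of a subtask graph concrete, the following is a minimal sketch, not the MTSGI implementation. It assumes that each subtask's precondition can be written as an OR of AND-clauses over completed subtasks. The function names (`eligible_subtasks`, `fill_from_prior`), the subtask names, and the fallback rule for using the prior are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Set

# Assumed representation: a precondition is a list of AND-clauses.
# A subtask is eligible when ANY clause has ALL of its subtasks completed.
# An empty list means "no precondition" (always eligible).
Precondition = List[FrozenSet[str]]
SubtaskGraph = Dict[str, Precondition]


def eligible_subtasks(graph: SubtaskGraph, completed: Set[str]) -> Set[str]:
    """Return the uncompleted subtasks whose precondition is satisfied."""
    result = set()
    for subtask, clauses in graph.items():
        if subtask in completed:
            continue
        if not clauses or any(clause <= completed for clause in clauses):
            result.add(subtask)
    return result


def fill_from_prior(partial: SubtaskGraph, prior: SubtaskGraph) -> SubtaskGraph:
    """Hypothetical fallback rule: keep what was inferred on the test task,
    and use the prior's precondition for every subtask not yet covered."""
    merged = dict(prior)
    merged.update(partial)
    return merged


# Invented web-navigation-style prior, standing in for a graph inferred
# from training tasks.
prior_graph: SubtaskGraph = {
    "fill_name": [],
    "fill_address": [],
    "choose_shipping": [frozenset({"fill_address"})],
    "place_order": [frozenset({"fill_name", "choose_shipping"})],
}

# On a new task, suppose only one precondition has been inferred so far.
test_graph = fill_from_prior(
    {"place_order": [frozenset({"fill_name"})]}, prior_graph
)
print(eligible_subtasks(test_graph, {"fill_name"}))
```

In this sketch the prior simply fills in preconditions the agent has not yet observed on the new task, so it can act sensibly after only a few interactions; the actual method may combine prior and test-time evidence differently.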