We propose ScheduleNet, an RL-based real-time scheduler that can solve various types of multi-agent scheduling problems. We formulate these problems as a semi-MDP with an episodic reward (makespan) and learn ScheduleNet, a decentralized decision-making policy that can effectively coordinate multiple agents to complete tasks. The decision-making procedure of ScheduleNet includes: (1) representing the state of a scheduling problem as an agent-task graph, (2) extracting node embeddings for the agent and task nodes, which capture the important relational information among agents and tasks, using type-aware graph attention (TGA), and (3) computing the assignment probability from the computed node embeddings. We validate the effectiveness of ScheduleNet as a general learning-based scheduler on various types of multi-agent scheduling tasks, including the multiple traveling salesman problem (mTSP) and the job-shop scheduling problem (JSP).
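The three-step decision procedure can be illustrated with a toy sketch. This is not the ScheduleNet implementation: the function names, the hand-set `TYPE_WEIGHT` values (standing in for learned parameters), the scalar distance features, and the scoring rule are all assumptions made for illustration. It only shows the shape of the pipeline: build an agent-task graph, aggregate neighbor information separately per node type with attention weights, then apply a softmax over candidate tasks.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def build_graph(agents, tasks):
    # Step 1: fully connected agent-task graph.
    # Nodes are (type, position); edges carry Euclidean distance.
    nodes = [("agent", p) for p in agents] + [("task", p) for p in tasks]
    n = len(nodes)
    dist = {(i, j): math.dist(nodes[i][1], nodes[j][1])
            for i in range(n) for j in range(n) if i != j}
    return nodes, dist

# Hypothetical per-type attention scales (learned in a real model).
TYPE_WEIGHT = {"agent": 0.5, "task": 1.0}

def embed(nodes, dist):
    # Step 2: simplified type-aware attention. For each node, attend over
    # neighbors of each type separately, then concatenate the aggregates.
    out = []
    for i in range(len(nodes)):
        vec = []
        for t in ("agent", "task"):
            nbrs = [j for j in range(len(nodes))
                    if j != i and nodes[j][0] == t]
            if not nbrs:
                vec.append(0.0)
                continue
            att = softmax([-TYPE_WEIGHT[t] * dist[(j, i)] for j in nbrs])
            vec.append(sum(a * dist[(j, i)] for a, j in zip(att, nbrs)))
        out.append(vec)
    return out

def assignment_probs(agent_idx, task_idxs, emb, dist):
    # Step 3: score each candidate task (arbitrary illustrative rule:
    # closer tasks with similar embeddings score higher), then softmax.
    scores = []
    for t in task_idxs:
        gap = sum(abs(x - y) for x, y in zip(emb[agent_idx], emb[t]))
        scores.append(-(dist[(agent_idx, t)] + gap))
    return softmax(scores)

agents = [(0.0, 0.0), (1.0, 1.0)]
tasks = [(0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
nodes, dist = build_graph(agents, tasks)
emb = embed(nodes, dist)
probs = assignment_probs(0, [2, 3, 4], emb, dist)  # distribution over tasks
```

In a decentralized setting, an idle agent would sample (or take the argmax) from `probs` to pick its next task, and the graph would be rebuilt as tasks are completed.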