We propose a framework that learns to schedule a job-shop scheduling problem (JSSP) using a graph neural network (GNN) and reinforcement learning (RL). We formulate the scheduling process of a JSSP as a sequential decision-making problem in which the state is represented as a graph, so that the structure of the JSSP is taken into account. To solve the formulated problem, the proposed framework employs a GNN to learn node features that embed the spatial structure of the JSSP represented as a graph (representation learning) and to derive the optimal scheduling policy that maps the embedded node features to the best scheduling action (policy learning). We employ a Proximal Policy Optimization (PPO)-based RL strategy to train these two modules in an end-to-end fashion. We empirically demonstrate that the GNN scheduler, owing to its strong generalization capability, outperforms dispatching rules favored in practice and RL-based schedulers on various benchmark JSSP instances. We also confirm that the proposed framework learns a transferable scheduling policy that can be employed to schedule a completely new JSSP (in terms of size and parameters) without further training.
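As a rough illustration of the graph-structured state and the sequential decision-making formulation described in the abstract, the following minimal sketch builds a graph for a toy JSSP instance and dispatches operations one at a time. Several things here are assumptions rather than details from the abstract: the disjunctive-graph encoding (precedence edges within a job, undirected edges between operations that share a machine), the toy instance `JOBS`, the helper names `build_graph`, `schedule`, and `spt_policy`, and the use of makespan as the objective. The shortest-processing-time rule only stands in for the learned policy; no GNN or PPO training is shown.

```python
from itertools import combinations

# Hypothetical toy instance: each job is an ordered list of (machine, duration).
JOBS = [
    [(0, 3), (1, 2)],   # job 0
    [(1, 4), (0, 1)],   # job 1
]

def build_graph(jobs):
    """Return (nodes, conjunctive_edges, disjunctive_edges) for a JSSP instance."""
    nodes = {}         # (job, op_index) -> {"machine": m, "duration": d}
    conjunctive = []   # directed edges: precedence between consecutive ops of a job
    by_machine = {}    # machine -> list of operation ids that need it
    for j, ops in enumerate(jobs):
        for k, (m, d) in enumerate(ops):
            nodes[(j, k)] = {"machine": m, "duration": d}
            by_machine.setdefault(m, []).append((j, k))
            if k > 0:
                conjunctive.append(((j, k - 1), (j, k)))
    # Undirected edges: pairs of operations competing for the same machine.
    disjunctive = [pair for ops in by_machine.values()
                   for pair in combinations(ops, 2)]
    return nodes, conjunctive, disjunctive

def schedule(jobs, policy):
    """Dispatch operations one at a time; `policy` picks one candidate per step."""
    next_op = [0] * len(jobs)     # index of each job's next unscheduled operation
    job_ready = [0] * len(jobs)   # time at which each job's previous operation ends
    machine_ready = {}            # time at which each machine becomes free
    while True:
        candidates = [(j, next_op[j]) for j in range(len(jobs))
                      if next_op[j] < len(jobs[j])]
        if not candidates:
            break
        j, k = policy(jobs, candidates)
        m, d = jobs[j][k]
        start = max(job_ready[j], machine_ready.get(m, 0))
        job_ready[j] = machine_ready[m] = start + d
        next_op[j] += 1
    return max(job_ready)         # makespan: completion time of the last operation

def spt_policy(jobs, candidates):
    """Shortest-processing-time rule, a stand-in for a learned policy."""
    return min(candidates, key=lambda c: jobs[c[0]][c[1]][1])

nodes, conj, disj = build_graph(JOBS)
makespan = schedule(JOBS, spt_policy)
print(len(nodes), len(conj), len(disj), makespan)
```

In a learned scheduler along the lines the abstract describes, `spt_policy` would presumably be replaced by a function that scores the candidate operations from GNN node embeddings of the current graph. Because `schedule` only asks the policy to choose among candidates, the same loop applies to instances of any size.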