Recent studies on using deep reinforcement learning (DRL) to solve job-shop scheduling problems (JSSP) focus on construction heuristics. However, their performance is still far from optimal, mainly because the underlying graph representation scheme is ill-suited to modeling the partial solutions that arise at each construction step. This paper proposes a novel DRL-based method to learn improvement heuristics for JSSP, in which a graph representation is used to encode complete solutions. We design a Graph Neural Network-based representation scheme consisting of two modules that capture the dynamic topology and the different types of nodes in the graphs encountered during the improvement process. To speed up solution evaluation during improvement, we design a novel message-passing mechanism that can evaluate multiple solutions simultaneously. Extensive experiments on classic benchmarks show that the improvement policy learned by our method outperforms state-of-the-art DRL-based methods by a large margin.
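To make the idea of evaluating several complete solutions at once more concrete, the following is a minimal sketch, not the mechanism proposed in the paper. It assumes each complete solution is given as a directed acyclic graph over operations (job-precedence arcs plus the machine-ordering arcs chosen by that solution) and computes every makespan in a batch by repeatedly propagating earliest start times along the arcs. The function name `batched_makespan`, the dense adjacency layout, and the toy instance are all illustrative assumptions.

```python
import numpy as np


def batched_makespan(adj, dur):
    """Makespan of each solution in a batch via max-plus propagation.

    adj: (B, N, N) bool array; adj[b, u, v] is True when operation u must
         finish before operation v starts in solution b (assumed acyclic).
    dur: (N,) array of processing times, shared by all solutions.
    """
    B, N, _ = adj.shape
    est = np.zeros((B, N))  # earliest start time of every operation
    for _ in range(N):  # a path in a DAG has at most N - 1 arcs
        finish = est + dur  # (B, N) earliest finish times
        # Each node takes the latest finish time among its predecessors;
        # nodes without predecessors keep start time 0.
        cand = np.where(adj, finish[:, :, None], 0.0)  # (B, N, N)
        new_est = cand.max(axis=1)  # (B, N)
        if np.array_equal(new_est, est):
            break  # all solutions in the batch have converged
        est = new_est
    return (est + dur).max(axis=1)


# Hypothetical 2-job, 2-machine instance.
# Job 0: op0 (machine 0, 3 units) -> op1 (machine 1, 2 units)
# Job 1: op2 (machine 1, 2 units) -> op3 (machine 0, 4 units)
dur = np.array([3.0, 2.0, 2.0, 4.0])


def build(arcs, n=4):
    a = np.zeros((n, n), dtype=bool)
    for u, v in arcs:
        a[u, v] = True
    return a


job_arcs = [(0, 1), (2, 3)]
sol_a = build(job_arcs + [(0, 3), (2, 1)])  # machine 1 runs op2 before op1
sol_b = build(job_arcs + [(0, 3), (1, 2)])  # machine 1 runs op1 before op2
makespans = batched_makespan(np.stack([sol_a, sol_b]), dur)
print(makespans)  # makespans: 7 for solution A, 11 for solution B
```

The dense `(B, N, N)` layout is chosen only for brevity; a sparse edge-list formulation would scale better to large instances, and the propagation step maps naturally onto GPU tensor operations.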