Multi-objective combinatorial optimization problems (MOCOPs), a class of complex optimization problems, arise widely in real-world applications. Although meta-heuristics have been successfully applied to MOCOPs, their computation time is often excessively long. Recently, a number of deep reinforcement learning (DRL) methods have been proposed to generate near-optimal solutions to combinatorial optimization problems. However, existing DRL studies have seldom focused on MOCOPs. This study proposes a single-model deep reinforcement learning framework, called the multi-objective Pointer Network (MOPN), in which the input structure of the Pointer Network (PN) is improved so that a single PN can solve MOCOPs. In addition, two training strategies, based on a representative model and on transfer learning, respectively, are proposed to further enhance the performance of MOPN in different application scenarios. Moreover, compared with classical meta-heuristics, MOPN obtains the Pareto front with only a forward propagation, consuming far less time. Meanwhile, MOPN is insensitive to problem scale, meaning that a trained MOPN can address MOCOPs of different scales. To verify the performance of MOPN, extensive experiments are conducted on three multi-objective traveling salesman problems, in comparison with the state-of-the-art model DRL-MOA and three classical multi-objective meta-heuristics. Experimental results demonstrate that the proposed model outperforms all the compared methods while requiring only 20\% to 40\% of the training time of DRL-MOA.
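The abstract states that the PN input structure is modified so that a single network can serve all scalarized subproblems of a MOCOP. The following is a minimal sketch, under the assumption that this is done by conditioning each node embedding on an objective-weight vector; the class name, feature dimensions, and concatenation scheme are illustrative assumptions, not the authors' exact design.

\begin{verbatim}
# Hypothetical sketch: condition a Pointer Network encoder on an
# objective-weight vector so one model covers many weighted subproblems
# of a bi-objective TSP. Names and dimensions are assumptions.
import torch
import torch.nn as nn

class WeightConditionedEncoder(nn.Module):
    def __init__(self, node_dim=4, weight_dim=2, hidden_dim=128):
        super().__init__()
        # Jointly embed [node features, objective weights].
        self.embed = nn.Linear(node_dim + weight_dim, hidden_dim)

    def forward(self, nodes, weights):
        # nodes:   (batch, n_cities, node_dim), e.g. two sets of 2-D coordinates
        # weights: (batch, weight_dim), e.g. (w1, w2) with w1 + w2 = 1
        w = weights.unsqueeze(1).expand(-1, nodes.size(1), -1)
        return self.embed(torch.cat([nodes, w], dim=-1))  # (batch, n, hidden)

# Usage: the same trained model can be queried with many weight vectors,
# each requiring only one forward pass, to trace an approximate Pareto front.
enc = WeightConditionedEncoder()
nodes = torch.rand(8, 30, 4)                       # 30-city bi-objective instances
weights = torch.tensor([[0.3, 0.7]]).expand(8, -1)
h = enc(nodes, weights)                            # (8, 30, 128)
\end{verbatim}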