Learning-based heuristics for solving combinatorial optimization problems have recently attracted much academic attention. While most existing works consider only single-objective problems with simple constraints, many real-world problems are multiobjective and carry a rich set of constraints. This paper proposes a multiobjective deep reinforcement learning algorithm combined with evolutionary learning for a typical complex problem, the multiobjective vehicle routing problem with time windows (MO-VRPTW). In the proposed algorithm, a decomposition strategy is applied to generate subproblems for a set of attention models. Comprehensive context information is introduced to further enhance the attention models, and evolutionary learning is employed to fine-tune their parameters. Experimental results on MO-VRPTW instances demonstrate the superiority of the proposed algorithm over other learning-based and iterative approaches.
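To make the decomposition idea concrete, the following is a minimal sketch of how a two-objective problem can be split into scalar subproblems by weight vectors. It assumes a weighted-sum scalarization and two illustrative objectives (number of vehicles, total travel distance); the abstract specifies neither, and the function names `uniform_weights` and `weighted_sum`, as well as the candidate values, are hypothetical.

```python
from typing import Dict, List, Tuple


def uniform_weights(n_subproblems: int) -> List[Tuple[float, float]]:
    """Evenly spaced weight vectors (w1, w2) with w1 + w2 = 1.

    Each weight vector defines one scalar subproblem. In a setup like the
    one described above, each subproblem would be assigned to one model.
    """
    if n_subproblems < 2:
        raise ValueError("need at least two subproblems")
    step = 1.0 / (n_subproblems - 1)
    return [(i * step, 1.0 - i * step) for i in range(n_subproblems)]


def weighted_sum(objectives: Tuple[float, float],
                 weights: Tuple[float, float]) -> float:
    """Scalarize an objective vector (assumed scheme: weighted sum)."""
    return sum(w * f for w, f in zip(weights, objectives))


# Hypothetical candidate solutions: (vehicles used, total distance).
candidates: Dict[str, Tuple[float, float]] = {
    "A": (3.0, 120.0),
    "B": (4.0, 95.0),
}

weights = uniform_weights(5)
for w in weights:
    # Each subproblem prefers the candidate with the lowest scalar cost.
    best = min(candidates, key=lambda name: weighted_sum(candidates[name], w))
    print(w, "->", best)
```

Different weight vectors favor different trade-offs between the objectives, so solving all subproblems yields a set of solutions that approximates the Pareto front. Other scalarizations, such as Tchebycheff, could be substituted for `weighted_sum` without changing the overall structure.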