Deep reinforcement learning (DRL) has recently proven successful in tackling complex combinatorial optimization problems. When these problems are extended to multiobjective ones, existing DRL approaches struggle to deal flexibly and efficiently with the multiple subproblems produced by weight decomposition of the objectives. This paper proposes a concise meta-learning-based DRL approach. It first trains a meta-model by meta-learning. The meta-model is then fine-tuned with a few update steps to derive a submodel for each subproblem, and the Pareto front is built from the solutions of these submodels. Computational experiments on multiobjective traveling salesman problems show that our method outperforms most learning-based and iteration-based approaches.
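To make the decomposition step concrete, the following is a minimal Python sketch of how weight vectors define subproblems and how a Pareto front can be assembled from candidate solutions. It assumes two minimization objectives and a weighted-sum scalarization; the function names (`uniform_weights`, `weighted_sum`, `pareto_front`) are hypothetical, and the sketch covers neither the meta-model nor its fine-tuning, which the abstract does not specify in detail.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def uniform_weights(n: int) -> List[Tuple[float, float]]:
    """Return n evenly spaced weight vectors (w1, w2) with w1 + w2 = 1.

    Assumption: two objectives; each weight vector defines one subproblem.
    """
    if n < 2:
        raise ValueError("need at least two weight vectors")
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


def weighted_sum(objs: Sequence[float], w: Sequence[float]) -> float:
    """Scalarize an objective vector with a weight vector (assumed scheme)."""
    return sum(wi * fi for wi, fi in zip(w, objs))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep only non-dominated points (minimization), sorted and deduplicated."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(set(front))


# Hypothetical objective vectors, e.g. (tour cost 1, tour cost 2) of candidate tours.
candidates = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0), (3.0, 3.0), (4.0, 4.0)]

# One subproblem per weight vector: pick the candidate with the best scalarized value.
weights = uniform_weights(3)
best_per_weight = [min(candidates, key=lambda c: weighted_sum(c, w)) for w in weights]

front = pareto_front(best_per_weight)
```

In the approach described above, each entry of `best_per_weight` would instead come from a submodel fine-tuned for that weight vector; the non-dominated filter at the end is the same regardless of how the candidates are produced.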