Recently, numerous machine-learning-based methods for combinatorial optimization problems have been proposed that learn to construct solutions in a sequential decision process via reinforcement learning. While these methods can be easily combined with search strategies like sampling and beam search, it is not straightforward to integrate them into a high-level search procedure offering strong search guidance. Bello et al. (2016) propose active search, which adjusts the weights of a (trained) model with respect to a single instance at test time using reinforcement learning. While active search is simple to implement, it is not competitive with state-of-the-art methods because adjusting all model weights for each test instance is very time- and memory-intensive. Instead of updating all model weights, we propose and evaluate three efficient active search strategies that only update a subset of parameters during the search. The proposed methods offer a simple way to significantly improve the search performance of a given model and outperform state-of-the-art machine-learning-based methods on combinatorial problems, even surpassing the well-known heuristic solver LKH3 on the capacitated vehicle routing problem. Finally, we show that (efficient) active search enables learned models to effectively solve instances that are much larger than those seen during training.
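The core idea — per-instance fine-tuning via REINFORCE where only a small subset of parameters is updated while the trained model stays frozen — can be illustrated with a minimal toy sketch. The sketch below is an assumption-laden stand-in, not the paper's method: a fixed distance heuristic plays the role of the frozen pretrained model, and a per-city `bias` vector plays the role of the small trainable subset (the paper adapts, e.g., instance embeddings of a trained network).

```python
import numpy as np

def tour_length(coords, tour):
    """Total length of a closed tour."""
    return sum(
        np.linalg.norm(coords[tour[i]] - coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def sample_tour(coords, bias, rng):
    """Sample a tour; return it with the gradient of its log-prob w.r.t. bias."""
    n = len(coords)
    tour, visited = [0], {0}
    grad = np.zeros(n)
    while len(tour) < n:
        cur = tour[-1]
        # frozen "model": negative distance to current city; trainable part: bias
        logits = -np.linalg.norm(coords - coords[cur], axis=1) + bias
        for v in visited:
            logits[v] = -1e9                 # mask visited cities
        p = np.exp(logits - logits.max())
        p /= p.sum()
        j = rng.choice(n, p=p)
        grad[j] += 1.0
        grad -= p                            # d log softmax(logits)_j / d bias
        tour.append(j)
        visited.add(j)
    return tour, grad

def efficient_active_search(coords, iters=100, batch=8, lr=0.1, seed=0):
    """REINFORCE on a single test instance, updating only `bias`."""
    rng = np.random.default_rng(seed)
    bias = np.zeros(len(coords))             # the only adapted parameters
    best_tour, best_len = None, float("inf")
    for _ in range(iters):
        grads, rewards, tours = [], [], []
        for _ in range(batch):
            t, g = sample_tour(coords, bias, rng)
            tours.append(t)
            grads.append(g)
            rewards.append(-tour_length(coords, t))   # maximize -length
        baseline = np.mean(rewards)                   # variance-reducing baseline
        for g, r in zip(grads, rewards):
            bias += lr / batch * (r - baseline) * g
        i = int(np.argmax(rewards))
        if -rewards[i] < best_len:
            best_len, best_tour = -rewards[i], tours[i]
    return best_tour, best_len
```

Because only the small `bias` vector is updated, each search step is far cheaper in time and memory than adapting all weights of the underlying model, which is the motivation for the efficient variants proposed in the paper.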