Operator Selection in Adaptive Large Neighborhood Search using Deep Reinforcement Learning

Large Neighborhood Search (LNS) is a popular heuristic for solving combinatorial optimization problems. LNS iteratively explores neighborhoods of the solution space using destroy and repair operators. Determining the best operators for LNS to solve the problem at hand is a labor-intensive process. Hence, Adaptive Large Neighborhood Search (ALNS) has been proposed to adaptively select operators during the search based on their performance in previous search iterations. This operator selection procedure is itself a heuristic based on domain knowledge, and it is ineffective in complex, large solution spaces. In this paper, we formulate the selection of operators at each search iteration of ALNS as a sequential decision problem and propose a Deep Reinforcement Learning based method called Deep Reinforced Adaptive Large Neighborhood Search. The proposed method learns, based on the state of the search, which operator to select in order to obtain a high long-term reward, i.e., a good solution to the underlying optimization problem. The method is evaluated on a time-dependent orienteering problem with stochastic weights and time windows. Results show that our approach effectively learns a strategy that adaptively selects operators for large neighborhood search, obtaining competitive results compared to a state-of-the-art machine learning approach while being trained with far fewer observations on small problem instances.
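For orientation, the classical ALNS selection scheme that the abstract contrasts against can be sketched as roulette-wheel selection over operator weights that are updated from recent performance. The sketch below is not the Deep Reinforcement Learning method proposed in the paper. The toy problem (ordering integers to minimize the sum of adjacent differences), the two destroy and two repair operators, the scores 1.0 and 0.1, the decay of 0.8, the improvement-only acceptance rule, and all function names are assumptions chosen for illustration.

```python
import random


def select_operator(weights, rng):
    """Roulette-wheel selection: index i is drawn with probability weights[i] / sum(weights)."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def update_weight(weights, index, score, decay=0.8):
    """Blend the old weight with the score earned in the latest iteration (assumed rule)."""
    weights[index] = decay * weights[index] + (1.0 - decay) * score


def tour_cost(tour):
    """Toy objective: sum of absolute differences between neighboring entries."""
    return sum(abs(a - b) for a, b in zip(tour, tour[1:]))


def destroy_random(tour, rng, k=3):
    """Remove k entries chosen uniformly at random."""
    removed = rng.sample(tour, k)
    return [x for x in tour if x not in removed], removed


def destroy_longest_edges(tour, rng, k=3):
    """Remove the k entries whose edge to their predecessor is longest."""
    order = sorted(range(1, len(tour)),
                   key=lambda i: abs(tour[i] - tour[i - 1]), reverse=True)
    removed = [tour[i] for i in order[:k]]
    return [x for x in tour if x not in removed], removed


def repair_random(partial, rng):
    """Reinsert each removed entry at a random position."""
    kept, removed = partial
    tour = list(kept)
    for x in removed:
        tour.insert(rng.randrange(len(tour) + 1), x)
    return tour


def repair_greedy(partial, rng):
    """Reinsert each removed entry at the position that yields the lowest cost."""
    kept, removed = partial
    tour = list(kept)
    for x in removed:
        pos = min(range(len(tour) + 1),
                  key=lambda p: tour_cost(tour[:p] + [x] + tour[p:]))
        tour.insert(pos, x)
    return tour


def alns(initial, destroy_ops, repair_ops, iterations=200, seed=0):
    """Weight-based ALNS loop that accepts only improving candidates (a simplification)."""
    rng = random.Random(seed)
    d_weights = [1.0] * len(destroy_ops)
    r_weights = [1.0] * len(repair_ops)
    current = list(initial)
    for _ in range(iterations):
        d = select_operator(d_weights, rng)
        r = select_operator(r_weights, rng)
        candidate = repair_ops[r](destroy_ops[d](current, rng), rng)
        if tour_cost(candidate) < tour_cost(current):
            current, score = candidate, 1.0   # reward operators that improved the solution
        else:
            score = 0.1                       # small score keeps every operator selectable
        update_weight(d_weights, d, score)
        update_weight(r_weights, r, score)
    return current, d_weights, r_weights


if __name__ == "__main__":
    start = list(range(12))
    random.Random(1).shuffle(start)
    best, d_w, r_w = alns(start, [destroy_random, destroy_longest_edges],
                          [repair_random, repair_greedy])
    print(tour_cost(start), tour_cost(best), d_w, r_w)
```

In this sketch the selection probabilities depend only on accumulated scores, not on the state of the search. That is the limitation the abstract points to when it motivates learning a state-dependent selection policy instead.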
