Deep learning approaches have shown promising results on routing problems. However, there is still a substantial gap in solution quality between machine learning and operations research algorithms. Recently, another line of research has emerged that fuses the strengths of machine learning and operations research algorithms. In particular, search perturbation operators have been used to improve solutions. Nevertheless, perturbation alone does not guarantee a high-quality solution. This paper presents "Learning to Guide Local Search" (L2GLS), a learning-based approach for routing problems that uses penalty terms and reinforcement learning to adaptively adjust search effort. L2GLS combines the strengths of local search (LS) operators with penalty terms to escape local optima. Routing problems have many practical applications, often presenting large instances that remain challenging for many existing algorithms in the learning-to-optimise field. We show that L2GLS achieves new state-of-the-art results among machine learning methods on large TSP and CVRP instances.
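To make the penalty mechanism concrete, the sketch below shows a plain guided-local-search loop for the TSP: a 2-opt local search optimises a penalty-augmented objective, and when it reaches a local optimum the highest-utility edges are penalised so the next search pass is pushed elsewhere. This is a minimal illustration of the classic GLS idea only; it does not include the learned, reinforcement-learning component of L2GLS, and names such as `guided_local_search` and the weight `lam` are illustrative assumptions, not the paper's implementation.

```python
import math
import random


def tour_length(tour, dist):
    """True (unpenalised) tour length."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def two_opt(tour, dist, penalties, lam):
    """2-opt local search on the penalty-augmented objective."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # would reconnect the same edges
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # Change in augmented cost if edges (a,b),(c,d) are replaced by (a,c),(b,d).
                delta = (dist[a][c] + dist[b][d] - dist[a][b] - dist[c][d]
                         + lam * (penalties[a][c] + penalties[b][d]
                                  - penalties[a][b] - penalties[c][d]))
                if delta < -1e-9:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour


def guided_local_search(dist, n_iters=50, lam=0.3, seed=0):
    """Generic GLS sketch: alternate local search with edge-penalty updates."""
    random.seed(seed)
    n = len(dist)
    tour = list(range(n))
    random.shuffle(tour)
    penalties = [[0.0] * n for _ in range(n)]
    best, best_len = list(tour), tour_length(tour, dist)
    for _ in range(n_iters):
        tour = two_opt(tour, dist, penalties, lam)
        cur_len = tour_length(tour, dist)
        if cur_len < best_len:
            best, best_len = list(tour), cur_len
        # Penalise the edges of the local optimum with maximum utility
        # util(e) = cost(e) / (1 + penalty(e)), the standard GLS rule.
        utils = []
        for i in range(n):
            a, b = tour[i], tour[(i + 1) % n]
            utils.append((dist[a][b] / (1.0 + penalties[a][b]), a, b))
        max_util = max(u for u, _, _ in utils)
        for u, a, b in utils:
            if abs(u - max_util) < 1e-12:
                penalties[a][b] += 1.0
                penalties[b][a] += 1.0
    return best, best_len


if __name__ == "__main__":
    random.seed(1)
    pts = [(random.random(), random.random()) for _ in range(30)]
    dist = [[math.dist(p, q) for q in pts] for p in pts]
    tour, length = guided_local_search(dist)
    print(f"best tour length: {length:.3f}")
```

In L2GLS, as described in the abstract, the search effort is additionally adjusted adaptively via reinforcement learning rather than with the fixed schedule used in this sketch.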