Learning a Large Neighborhood Search Algorithm for Mixed Integer Programs

Large Neighborhood Search (LNS) is a combinatorial optimization heuristic that starts with an assignment of values for the variables to be optimized, and iteratively improves it by searching a large neighborhood around the current assignment. In this paper we consider a learning-based LNS approach for mixed integer programs (MIPs). We train a Neural Diving model to represent a probability distribution over assignments, which, together with an existing MIP solver, generates an initial assignment. Formulating the subsequent search steps as a Markov Decision Process, we train a Neural Neighborhood Selection policy to select a search neighborhood at each step, which is searched using a MIP solver to find the next assignment. The policy network is trained using imitation learning. We propose a target policy for imitation that, given enough compute resources, is guaranteed to select the neighborhood containing the optimal next assignment across all possible choices for the neighborhood of a specified size. Our approach matches or outperforms all the baselines on five real-world MIP datasets with large-scale instances from diverse applications, including two production applications at Google. At large running times it achieves $2\times$ to $37.8\times$ better average primal gap than the best baseline on three of the datasets.
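To make the search loop concrete, here is a minimal sketch of a generic LNS iteration on a toy 0/1 knapsack problem. It is an illustration under stated assumptions, not the paper's method: the neighborhood is chosen uniformly at random (a stand-in for the learned Neural Neighborhood Selection policy), the initial assignment is the all-zeros vector (a stand-in for Neural Diving), and the sub-problem is solved by brute-force enumeration (a stand-in for a MIP solver). The toy problem maximizes its objective, whereas MIPs are usually stated as minimization. All function names and parameters are hypothetical.

```python
import itertools
import random


def objective(x, values):
    """Total value of the selected items (to be maximized in this toy problem)."""
    return sum(v * xi for v, xi in zip(values, x))


def feasible(x, weights, capacity):
    """Single knapsack constraint: total selected weight must not exceed capacity."""
    return sum(w * xi for w, xi in zip(weights, x)) <= capacity


def solve_sub_problem(x, free, values, weights, capacity):
    """Search the neighborhood of x: variables in `free` may change, all others stay fixed.

    Brute-force enumeration stands in for the MIP solver (assumption).
    The current assignment is always among the candidates, so the result is never worse.
    """
    best, best_val = list(x), objective(x, values)
    for bits in itertools.product((0, 1), repeat=len(free)):
        cand = list(x)
        for i, b in zip(free, bits):
            cand[i] = b
        if feasible(cand, weights, capacity):
            val = objective(cand, values)
            if val > best_val:
                best, best_val = cand, val
    return best


def lns(values, weights, capacity, neighborhood_size=3, steps=20, seed=0):
    """Generic LNS loop: pick a neighborhood, search it, move to the best assignment found."""
    rng = random.Random(seed)
    n = len(values)
    x = [0] * n  # trivially feasible start when capacity >= 0 (stand-in for Neural Diving)
    history = [objective(x, values)]
    for _ in range(steps):
        # Random selection of unfixed variables (stand-in for the learned policy).
        free = rng.sample(range(n), neighborhood_size)
        x = solve_sub_problem(x, free, values, weights, capacity)
        history.append(objective(x, values))
    return x, history


if __name__ == "__main__":
    values = [10, 7, 4, 9, 6, 3]
    weights = [5, 4, 2, 6, 3, 1]
    x, history = lns(values, weights, capacity=10, steps=30, seed=1)
    print("assignment:", x)
    print("objective per step:", history)
```

In this sketch the objective never gets worse from one step to the next, because the current assignment always belongs to the neighborhood being searched. A learned selection policy would replace only the `rng.sample` line; the rest of the loop would stay the same.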
