Recent neural methods for vehicle routing problems typically train and test the deep models on the same instance distribution (i.e., uniform). To tackle the consequent cross-distribution generalization concerns, we bring knowledge distillation to this field and propose an Adaptive Multi-Distribution Knowledge Distillation (AMDKD) scheme for learning more generalizable deep models. In particular, our AMDKD leverages the diverse knowledge of multiple teachers trained on exemplar distributions to yield a lightweight yet generalist student model. Meanwhile, we equip AMDKD with an adaptive strategy that allows the student to concentrate on difficult distributions, so as to absorb hard-to-master knowledge more effectively. Extensive experimental results show that, compared with the baseline neural methods, our AMDKD achieves competitive results on both unseen in-distribution and out-of-distribution instances, which are either randomly synthesized or drawn from benchmark datasets (i.e., TSPLIB and CVRPLIB). Notably, our AMDKD is generic and consumes fewer computational resources for inference.
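To make the described scheme concrete, the sketch below illustrates the general idea of adaptive multi-teacher distillation: teachers pre-trained on exemplar distributions guide a smaller student, and sampling weights shift toward the distributions on which the student currently lags. This is a minimal, hypothetical illustration; the names (TinyPolicy, sample_instances), the toy instance generators, and the softmax re-weighting rule are assumptions for exposition, not the paper's actual models, loss, or adaptive strategy.

```python
# Hypothetical sketch of adaptive multi-distribution knowledge distillation.
import torch
import torch.nn as nn
import torch.nn.functional as F

DISTRIBUTIONS = ["uniform", "cluster", "mixed"]  # exemplar distributions (illustrative)

def sample_instances(dist_name, batch=32, n=20):
    """Toy generator of routing instances as (batch, n, 2) node coordinates."""
    if dist_name == "uniform":
        return torch.rand(batch, n, 2)
    if dist_name == "cluster":
        centers = torch.rand(batch, 1, 2)
        return (centers + 0.05 * torch.randn(batch, n, 2)).clamp(0, 1)
    half = n // 2  # "mixed": half uniform, half clustered nodes
    return torch.cat([sample_instances("uniform", batch, half),
                      sample_instances("cluster", batch, n - half)], dim=1)

class TinyPolicy(nn.Module):
    """Stand-in for a neural routing model: maps an instance to
    log-probabilities over its nodes (e.g., next-node selection)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, coords):                  # coords: (batch, n, 2)
        logits = self.net(coords).squeeze(-1)   # (batch, n)
        return F.log_softmax(logits, dim=-1)

teachers = {d: TinyPolicy() for d in DISTRIBUTIONS}  # assumed pre-trained, one per distribution
student = TinyPolicy(hidden=32)                      # lightweight student
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

# Adaptive weights over distributions: start uniform, then emphasize the
# distributions where the student's distillation loss is highest.
weights = torch.ones(len(DISTRIBUTIONS)) / len(DISTRIBUTIONS)

for epoch in range(10):
    losses = torch.zeros(len(DISTRIBUTIONS))
    optimizer.zero_grad()
    for i, d in enumerate(DISTRIBUTIONS):
        coords = sample_instances(d)
        with torch.no_grad():
            teacher_logp = teachers[d](coords)   # teacher guidance on its own distribution
        student_logp = student(coords)
        # Distillation loss: KL divergence between teacher and student outputs
        kd = F.kl_div(student_logp, teacher_logp, log_target=True, reduction="batchmean")
        (weights[i] * kd).backward()
        losses[i] = kd.detach()
    optimizer.step()
    # Re-weight toward hard-to-master distributions (illustrative rule)
    weights = F.softmax(losses, dim=0)
```

Here the re-weighting step plays the role of the adaptive strategy: distributions with larger remaining distillation loss receive larger sampling weight in the next epoch, so the student spends more capacity on the knowledge it has not yet absorbed.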