A general framework of unsupervised learning for combinatorial optimization (CO) is to train a neural network (NN) whose output yields a problem solution by directly optimizing the CO objective. Although this framework has some advantages over traditional solvers, it optimizes average performance over a distribution of historical problem instances, which misaligns with the actual goal of CO: finding a good solution to every future instance encountered. Motivated by this observation, we propose a new objective of unsupervised learning for CO, where the goal of learning is to search for good initializations for future problem instances rather than to give direct solutions, and we design a meta-learning-based training pipeline for this objective. Our method achieves strong empirical performance. We observe that even the initial solution given by our model before fine-tuning significantly outperforms the baselines under various evaluation settings, including evaluation across multiple datasets and cases with large shifts in problem scale. We conjecture that meta-learning-based training keeps the model only loosely tied to the local optimum of any single training instance while making it more adaptive to changes in the optimization landscape across instances.
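A minimal formalization of the contrast drawn above, offered as a hedged sketch: the notation is ours, and the single inner gradient step with step size $\alpha$ is an assumption standing in for the per-instance fine-tuning procedure, which the abstract does not specify. Let $f(\cdot\,;I)$ be the (relaxed, differentiable) CO objective of instance $I$, $\mathcal{D}$ the instance distribution, and $\mathrm{NN}_\theta$ the network producing a candidate solution.

% Hedged sketch; the notation and the one-step inner loop are assumptions, not the paper's exact formulation.
$$\min_{\theta}\ \mathbb{E}_{I\sim\mathcal{D}}\big[\,f(\mathrm{NN}_{\theta}(I);\,I)\,\big] \qquad \text{(existing framework: average quality of direct solutions)}$$
$$\min_{\theta}\ \mathbb{E}_{I\sim\mathcal{D}}\big[\,f(\mathrm{NN}_{\theta'_I}(I);\,I)\,\big],\qquad \theta'_I=\theta-\alpha\,\nabla_{\theta} f(\mathrm{NN}_{\theta}(I);\,I) \qquad \text{(proposed: quality after per-instance adaptation)}$$

Under this reading, $\theta$ is judged not by the solutions it emits directly but by how good a starting point it is for instance-wise fine-tuning, which is consistent with the observation above that even the solution produced before fine-tuning can already outperform the baselines.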