We present a novel neural architecture to solve graph optimization problems where the solution consists of arbitrary node labels, allowing us to solve hard problems like graph coloring. We train our model using reinforcement learning, specifically policy gradients, which gives us both a greedy and a probabilistic policy. Our architecture builds on a graph attention network and uses several inductive biases to improve solution quality. Our learned deterministic heuristics for graph coloring give better solutions than classical degree-based greedy heuristics and only take seconds to apply to graphs with tens of thousands of vertices. Moreover, our probabilistic policies outperform all state-of-the-art greedy coloring baselines and a machine learning baseline. Finally, we show that our approach also generalizes to other problems by evaluating it on minimum vertex cover and outperforming two greedy heuristics.
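For reference, the classical degree-based greedy heuristics mentioned above follow a simple pattern: visit vertices in order of decreasing degree and assign each the smallest color not used by its already-colored neighbors. The sketch below is illustrative only (the function name and adjacency-dict representation are our own, not from the paper), but it shows the kind of baseline the learned policies are compared against.

```python
def greedy_color(adj):
    """Largest-degree-first greedy coloring.

    adj: dict mapping each vertex to an iterable of its neighbors.
    Returns a dict mapping each vertex to a nonnegative color index.
    """
    # Visit vertices in order of decreasing degree.
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    color = {}
    for v in order:
        # Colors already taken by colored neighbors of v.
        used = {color[u] for u in adj[v] if u in color}
        # Assign the smallest color not in use among neighbors.
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Example: a triangle requires three colors.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(len(set(greedy_color(triangle).values())))  # → 3
```

Such heuristics run in near-linear time but can use far more colors than optimal on adversarial instances, which is the gap the learned policies aim to close.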