In this paper, we propose GCOMB, a deep reinforcement learning framework for learning algorithms that solve combinatorial problems over large graphs. GCOMB mimics the greedy algorithm for the original problem, incrementally constructing a solution. The framework uses a Graph Convolutional Network (GCN) to generate node embeddings that predict, out of the entire node set, which nodes are likely to belong to the solution set. These embeddings enable an efficient training process for learning the greedy policy via Q-learning. Through extensive evaluation on several real and synthetic datasets containing up to a million nodes, we establish that GCOMB is up to 41% better than the state of the art, up to seven times faster than the greedy algorithm, and robust and scalable to large dynamic networks.
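To make the incremental greedy construction concrete, the following is a minimal sketch and not the paper's actual implementation: `greedy_construct`, `score_fn`, and the toy linear scorer are hypothetical placeholders standing in for the learned Q-network and the GCN-produced embeddings.

```python
import numpy as np

def greedy_construct(embeddings, score_fn, budget):
    """Incrementally grow a solution set: at each step, pick the node
    whose predicted marginal value (Q-value) is highest, mirroring the
    greedy algorithm that the framework imitates."""
    n = embeddings.shape[0]
    solution, remaining = [], set(range(n))
    for _ in range(budget):
        # Score each remaining candidate given the current partial solution.
        best = max(remaining, key=lambda v: score_fn(embeddings[v], solution))
        solution.append(best)
        remaining.discard(best)
    return solution

# Hypothetical stand-in for a trained Q-network: a fixed linear score
# over toy node embeddings (which a GCN would produce in practice).
rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 4))
w = rng.normal(size=4)
score = lambda e, sol: float(e @ w)  # ignores the partial solution, for brevity
print(greedy_construct(emb, score, budget=3))
```

In the actual framework, the scorer would be a Q-network trained so that its value estimates reproduce the marginal gains a greedy oracle would compute, letting the loop above approximate greedy selection without evaluating the expensive objective at every step.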