Combinatorial optimization problems (COPs) on graphs, which have many real-life applications, are canonical challenges in computer science. Because high-quality labels for problem instances are hard to obtain, supervised learning is difficult to apply across combinatorial problems. Reinforcement learning (RL) algorithms have recently been adopted to address this challenge automatically. The underlying principle of this approach is to deploy a graph neural network (GNN) that encodes both the local information of the nodes and the graph structure in order to capture the current state of the environment. An actor then learns problem-specific heuristics on its own and makes an informed decision at each state, eventually reaching a good solution. Recent studies on this subject mainly focus on one family of combinatorial problems on graphs, such as the traveling salesman problem, where the proposed model aims to find an ordering of vertices that optimizes a given objective function. We use security-aware phone clone allocation in the cloud, formulated as a classical quadratic assignment problem (QAP), to investigate whether deep RL-based models are generally applicable to other classes of such hard problems. Extensive empirical evaluation shows that existing RL-based models may not generalize to QAP.
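To make the contrast with ordering problems concrete, the sketch below evaluates the standard Koopmans-Beckmann form of the QAP objective, where the cost of an assignment depends on pairs of placements rather than on a sequence of vertices. The names `qap_cost` and `brute_force_qap` and the toy `flow` and `dist` matrices are hypothetical and chosen only for illustration. Reading `flow` as interaction between phone clones and `dist` as separation between cloud hosts is an assumption, not the formulation evaluated in this work.

```python
from itertools import permutations


def qap_cost(flow, dist, assign):
    """Koopmans-Beckmann QAP cost.

    assign[i] is the location given to facility i; the cost sums
    flow[i][j] * dist[assign[i]][assign[j]] over all ordered pairs.
    """
    n = len(flow)
    return sum(
        flow[i][j] * dist[assign[i]][assign[j]]
        for i in range(n)
        for j in range(n)
    )


def brute_force_qap(flow, dist):
    """Exhaustive search over all n! assignments (tiny instances only)."""
    n = len(flow)
    return min(permutations(range(n)), key=lambda p: qap_cost(flow, dist, p))


# Hypothetical toy instance: 3 facilities, 3 locations.
flow = [[0, 3, 1],
        [3, 0, 2],
        [1, 2, 0]]
dist = [[0, 1, 4],
        [1, 0, 2],
        [4, 2, 0]]

best = brute_force_qap(flow, dist)
best_cost = qap_cost(flow, dist, best)
```

Exhaustive search is used here only as a reference point; it is not an RL method, and it becomes infeasible beyond very small instances because the number of assignments grows factorially.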