Dynamic Network Reconfiguration for Entropy Maximization using Deep Reinforcement Learning

A key problem in network theory is how to reconfigure a graph in order to optimize a quantifiable objective. Given the ubiquity of networked systems, such work has broad practical applications in a variety of situations, ranging from drug and material design to telecommunications. The large decision space of possible reconfigurations, however, makes this problem computationally intensive. In this paper, we cast the problem of network rewiring for optimizing a specified structural property as a Markov Decision Process (MDP), in which a decision-maker is given a budget of modifications that are performed sequentially. We then propose a general approach based on the Deep Q-Network (DQN) algorithm and graph neural networks (GNNs) that can efficiently learn strategies for rewiring networks. We then discuss a cybersecurity case study, i.e., an application to the computer network reconfiguration problem for intrusion protection. In a typical scenario, an attacker might have a (partial) map of the system they plan to penetrate; if the network is effectively "scrambled", they would not be able to navigate it since their prior knowledge would become obsolete. This can be viewed as an entropy maximization problem, in which the goal is to increase the surprise of the network. Indeed, entropy acts as a proxy measurement of the difficulty of navigating the network topology. We demonstrate the general ability of the proposed method to obtain better entropy gains than random rewiring on synthetic and real-world graphs while being computationally inexpensive, as well as being able to generalize to larger graphs than those seen during training. Simulations of attack scenarios confirm the effectiveness of the learned rewiring strategies.
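The MDP framing described in the abstract can be illustrated with a minimal sketch: the state is the current edge set, an action removes one edge and adds another at the same endpoint, and an episode ends when the modification budget is spent. Everything specific here is an assumption made for illustration, not the authors' implementation. The objective `degree_entropy` (Shannon entropy of the degree distribution) is only a stand-in for whichever entropy measure the paper uses, the rewiring rule in `legal_actions` is one plausible choice, and `greedy_policy` is a one-step lookahead used in place of the learned DQN/GNN policy. `random_policy` corresponds to the random-rewiring baseline the abstract mentions.

```python
import math
import random
from collections import Counter


def norm(u, v):
    """Canonical form of an undirected edge."""
    return (u, v) if u < v else (v, u)


def degree_entropy(edges, n):
    """Shannon entropy (bits) of the degree distribution.

    Assumed stand-in objective; the paper's entropy measure may differ.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    counts = Counter(deg)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def legal_actions(edges, n):
    """Rewirings that drop edge (a, b) and add a non-edge (a, w)."""
    actions = []
    for u, v in sorted(edges):
        for a, b in ((u, v), (v, u)):
            for w in range(n):
                if w not in (a, b) and norm(a, w) not in edges:
                    actions.append((norm(a, b), norm(a, w)))
    return actions


def step(edges, action):
    """Apply one rewiring; the number of edges is preserved."""
    old, new = action
    return (edges - {old}) | {new}


def random_policy(edges, n, actions, rng):
    """Baseline: pick a legal rewiring uniformly at random."""
    return rng.choice(actions)


def greedy_policy(edges, n, actions, rng):
    """One-step lookahead; a hypothetical stand-in for a learned policy."""
    return max(actions, key=lambda a: degree_entropy(step(edges, a), n))


def rollout(edges, n, budget, policy, seed=0):
    """Run one episode of `budget` sequential modifications."""
    rng = random.Random(seed)
    start = degree_entropy(edges, n)
    for _ in range(budget):
        actions = legal_actions(edges, n)
        if not actions:
            break
        edges = step(edges, policy(edges, n, actions, rng))
    return edges, degree_entropy(edges, n) - start


# Example: an 8-node ring, where every node starts with degree 2.
ring = {norm(i, (i + 1) % 8) for i in range(8)}
greedy_edges, greedy_gain = rollout(ring, 8, budget=3, policy=greedy_policy)
random_edges, random_gain = rollout(ring, 8, budget=3, policy=random_policy)
```

On the ring, every node starts with the same degree, so the assumed entropy starts at zero and any rewiring raises it. A learned policy would replace `greedy_policy` with a function that scores actions from a graph representation rather than by exhaustively evaluating each candidate.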


