Network data, commonly used throughout the physical, social, and biological sciences, consist of nodes (individuals) and the edges (interactions) between them. One way to represent the complex, high-dimensional structure in network data is to embed the graph into a low-dimensional geometric space. The curvature of this space, in particular, provides insight into structure in the graph, such as the propensity to form triangles or to exhibit tree-like structure. We derive an estimating function for curvature based on triangle side lengths and the midpoints of sides, for which the only input is a distance matrix, and we establish asymptotic normality of the resulting estimator. We next introduce a novel latent distance matrix estimator for networks, together with an efficient algorithm that computes the estimate by iteratively solving quadratic programs. We apply this method to the Los Alamos National Laboratory Unified Network and Host dataset and show that curvature estimates can detect a red-team attack faster than naive methods. We also use it to discover non-constant latent curvature in physics coauthorship networks.
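To make the idea of estimating curvature from triangle side lengths and a midpoint concrete, the following is a minimal sketch, not the estimator derived in the paper. It assumes the standard median identity in a space of constant curvature κ: for a triangle with a = d(x, y), b = d(x, z), c = d(y, z), and m the distance from x to the midpoint of y and z, cos(√κ m) = [cos(√κ a) + cos(√κ b)] / [2 cos(√κ c / 2)] when κ > 0, with cosh replacing cos when κ < 0 and Apollonius' theorem in the limit κ = 0. The function names, the bisection solver, and the default bracket are illustrative assumptions.

```python
import math


def median_length(kappa, a, b, c):
    """Distance from x to the midpoint of (y, z) in a space of constant
    curvature kappa, given a = d(x, y), b = d(x, z), c = d(y, z)."""
    if abs(kappa) < 1e-8:
        # Euclidean limit (Apollonius' theorem); also avoids round-off near zero.
        return math.sqrt(max(0.0, (a * a + b * b) / 2.0 - c * c / 4.0))
    s = math.sqrt(abs(kappa))
    if kappa > 0:
        ratio = (math.cos(s * a) + math.cos(s * b)) / (2.0 * math.cos(s * c / 2.0))
        # Clamp to the domain of acos to absorb floating-point error.
        return math.acos(max(-1.0, min(1.0, ratio))) / s
    ratio = (math.cosh(s * a) + math.cosh(s * b)) / (2.0 * math.cosh(s * c / 2.0))
    return math.acosh(max(1.0, ratio)) / s


def estimate_curvature(a, b, c, m, lo=-50.0, tol=1e-10, max_iter=200):
    """Solve median_length(kappa, a, b, c) = m for kappa by bisection.

    The lower bracket `lo` is an arbitrary assumption. The upper bracket is
    just below the largest curvature at which a spherical triangle with
    perimeter a + b + c can exist.
    """
    hi = 0.99 * (2.0 * math.pi / (a + b + c)) ** 2

    def f(kappa):
        return median_length(kappa, a, b, c) - m

    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change in the bracket; widen `lo` or check the distances")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Octant triangle on the unit sphere: all three sides and the median equal pi/2.
quarter = math.pi / 2.0
kappa_sphere = estimate_curvature(quarter, quarter, quarter, quarter)

# Planar triangle y = (0, 0), z = (2, 0), x = (1, 2): the median has length 2.
kappa_flat = estimate_curvature(math.sqrt(5.0), math.sqrt(5.0), 2.0, 2.0)
```

In this sketch the four inputs would be read off a distance matrix, with the fourth taken from a point lying approximately at the midpoint of y and z. Noise in the distances, imperfect midpoints, and the choice of triangle are not handled here.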