The study of statistical models of network structure, pursued across numerous disciplines and contexts, is fundamentally challenging because of (often high-order) dependence between connections. A common approach assigns each person in the graph to a position on a low-dimensional manifold. Distance between individuals in this (latent) space is inversely proportional to the likelihood of forming a connection. The choice of the latent geometry (the manifold class, dimension, and curvature) materially affects the substantive conclusions drawn from the model. More positive curvature in the manifold, for example, encourages more and tighter communities; negative curvature induces repulsion among nodes. Currently, however, the latent geometry is chosen as an a priori modeling assumption, and there is limited guidance about how to make this choice in a data-driven way. In this work, we present a method to consistently estimate the manifold type, dimension, and curvature within an empirically relevant class of latent spaces: simply connected, complete Riemannian manifolds of constant curvature. Our core insight is to represent the graph as a noisy distance matrix based on the ties between groups of nodes: either cliques or, when the researcher observes traits, trait-groups. Leveraging results from statistical geometry, we develop hypothesis tests to determine whether the observed distances could plausibly be embedded isometrically in each of the candidate geometries. The method applies when the researcher observes the full graph and also in empirically relevant cases where only partial data are observed. We explore the accuracy of our approach with simulations and then apply it to datasets from economics, sociology, and neuroscience.
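The question of whether a distance matrix can be embedded isometrically in a candidate geometry can be illustrated in the noiseless case with classical results of Schoenberg. The sketch below is an illustration only, not the hypothesis tests developed in this work: it assumes the distances are known exactly, it covers only the Euclidean and spherical cases, and the function names (`euclidean_gram`, `spherical_gram`, `is_psd`) and the tolerance are hypothetical choices made for this example. A distance matrix D embeds in some Euclidean space exactly when the double-centered matrix of squared distances is positive semidefinite, and it embeds in a sphere of curvature kappa > 0 exactly when cos(sqrt(kappa) D) is positive semidefinite.

```python
import numpy as np

def euclidean_gram(D):
    """Double-centered matrix -1/2 * J * D^2 * J (the classical MDS Gram matrix)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * J @ (D ** 2) @ J

def spherical_gram(D, kappa):
    """Matrix cos(sqrt(kappa) * D) for geodesic distances D and curvature kappa > 0."""
    return np.cos(np.sqrt(kappa) * D)

def min_eigenvalue(M):
    """Smallest eigenvalue of the symmetrized matrix."""
    return float(np.linalg.eigvalsh((M + M.T) / 2.0).min())

def is_psd(M, tol=1e-9):
    """Positive semidefinite up to a numerical tolerance (an assumed threshold)."""
    return min_eigenvalue(M) >= -tol

# Four equally spaced points on the unit circle, with arc-length (geodesic) distances.
theta = np.arange(4) * np.pi / 2
gap = np.abs(theta[:, None] - theta[None, :])
D_circle = np.minimum(gap, 2 * np.pi - gap)

# Six random points in the plane, with ordinary Euclidean distances.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
D_plane = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

print("circle fits sphere (kappa=1):", is_psd(spherical_gram(D_circle, 1.0)))
print("circle fits Euclidean space: ", is_psd(euclidean_gram(D_circle)))
print("plane fits Euclidean space:  ", is_psd(euclidean_gram(D_plane)))
```

With noisy distances estimated from a graph, the smallest eigenvalue will rarely be exactly nonnegative, which is why a formal test, rather than a fixed tolerance, is needed to decide between candidate geometries.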