Many data analysis problems can be cast as distance geometry problems in \emph{space forms}: Euclidean, elliptic, or hyperbolic spaces. We ask what can be said about the underlying space form when we are given only a subset of comparisons between pairwise distances, without computing an explicit embedding. To study this question, we define the \emph{ordinal capacity} of a metric space, which measures how well the space can accommodate a given set of ordinal measurements. We prove that the ordinal capacity of a space form is related to its dimension and the sign of its curvature, and we provide a lower bound on the embedding dimension of non-metric graphs in terms of the \emph{ordinal spread} of their sub-cliques. Finally, we show how the statistics of ordinal spread can be used to identify the underlying space form of similarity graphs derived from weighted trees and gene expression data.