Two-sample tests based on a similarity graph over the observations are useful for high-dimensional and non-Euclidean data because they are flexible and perform well under a wide range of alternatives. Existing work has mainly focused on sparse graphs, such as graphs whose number of edges is on the order of the number of observations. However, the tests perform better with denser graphs in many settings. In this work, we establish the theoretical foundation for graph-based tests on graphs that are much denser than those considered in existing work.
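To make the notion of a graph-based two-sample test concrete, the following is a minimal sketch of one classical instance, an edge-count test on a k-nearest-neighbor graph with a permutation null. It is an illustration only and not necessarily the statistic studied in this work; the function names (`knn_edges`, `between_sample_edges`, `edge_count_test`), the Euclidean distance, and the parameters `k`, `n_perm`, and `seed` are all assumptions introduced for the example.

```python
import numpy as np


def knn_edges(points, k):
    """Return the undirected edges of a k-nearest-neighbor graph (Euclidean distance assumed)."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    edges = set()
    for i in range(points.shape[0]):
        for j in neighbors[i]:
            j = int(j)
            edges.add((min(i, j), max(i, j)))  # store each undirected edge once
    return np.array(sorted(edges))


def between_sample_edges(edges, labels):
    """Count edges whose endpoints carry different sample labels."""
    return int(np.sum(labels[edges[:, 0]] != labels[edges[:, 1]]))


def edge_count_test(x, y, k=5, n_perm=500, seed=0):
    """Permutation p-value; few between-sample edges is evidence against equal distributions."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    labels = np.concatenate([np.zeros(len(x), dtype=int), np.ones(len(y), dtype=int)])
    edges = knn_edges(pooled, k)  # the graph is built once on the pooled sample
    observed = between_sample_edges(edges, labels)
    # Null distribution: relabel the nodes at random while keeping the graph fixed.
    perm_counts = np.array(
        [between_sample_edges(edges, rng.permutation(labels)) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(perm_counts <= observed)) / (1 + n_perm)
    return observed, p_value
```

In this sketch the density of the graph is controlled by `k`: with `k` fixed, the number of edges is on the order of the number of observations (the sparse regime), whereas letting `k` grow with the sample size yields the denser graphs discussed above.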