Two-sample tests that use a similarity graph on the observations are useful for high-dimensional and non-Euclidean data because of their flexibility and good performance under a wide range of alternatives. Existing work has mainly focused on sparse graphs, such as graphs whose number of edges is of the order of the number of observations, and the asymptotic results impose strong conditions on the graph that are easily violated by the commonly constructed graphs those works suggest. Moreover, the tests perform better with denser graphs in many settings. In this work, we establish the theoretical ground for graph-based tests with graphs ranging from those recommended in the current literature to much denser ones.