Two-sample tests based on a similarity graph over the observations are useful for high-dimensional and non-Euclidean data because they are flexible and perform well under a wide range of alternatives. Existing work has mainly focused on sparse graphs, such as graphs whose number of edges is on the order of the number of observations, and the asymptotic results in that work impose strong conditions on the graph that are easily violated by the commonly constructed graphs it recommends. Moreover, graph-based tests perform better with denser graphs in many settings. In this work, we establish the theoretical foundation for graph-based tests on graphs ranging from those recommended in the current literature to much denser ones.
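To make the idea of a graph-based two-sample test concrete, the following is a minimal sketch rather than the procedure analyzed in this work. It assumes a k-nearest-neighbour graph built from Euclidean distances, uses the classical between-sample edge count as the statistic, and calibrates it by permuting sample labels instead of using an asymptotic approximation. The function names (`knn_edges`, `between_sample_edges`, `edge_count_test`) and all parameter values are hypothetical choices made for illustration. The parameter `k` controls how dense the graph is.

```python
import numpy as np


def knn_edges(points, k):
    """Undirected edge list of the k-nearest-neighbour graph (Euclidean distance)."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    edges = set()
    for i, row in enumerate(neighbours):
        for j in row:
            # store each undirected edge once, smaller index first
            edges.add((min(i, int(j)), max(i, int(j))))
    return np.array(sorted(edges))


def between_sample_edges(edges, labels):
    """Number of edges whose endpoints come from different samples."""
    return int(np.sum(labels[edges[:, 0]] != labels[edges[:, 1]]))


def edge_count_test(x, y, k=5, n_perm=999, seed=0):
    """Permutation test: unusually few between-sample edges suggest a difference."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    labels = np.r_[np.zeros(len(x), dtype=int), np.ones(len(y), dtype=int)]
    edges = knn_edges(pooled, k)
    observed = between_sample_edges(edges, labels)
    # The graph is built once on the pooled data; only the labels are permuted.
    permuted = np.array([
        between_sample_edges(edges, rng.permutation(labels))
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(permuted <= observed)) / (n_perm + 1)
    return observed, p_value


# Assumed toy data: two Gaussian samples in 20 dimensions with shifted means.
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=(40, 20))
y = rng.normal(2.0, 1.0, size=(40, 20))
observed, p_value = edge_count_test(x, y, k=5)
print(observed, p_value)
```

Increasing `k` in this sketch yields a denser graph on the same observations, which is the regime the abstract argues deserves theoretical support. The permutation calibration is used here only because it needs no conditions on the graph.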