In this article, we propose a two-sample test for functional observations modeled as elements of a separable Hilbert space. We present a general recipe for constructing a measure of dissimilarity between the distributions of two Hilbertian random variables, and we study the theoretical properties of one such measure, obtained by computing the Maximum Mean Discrepancy (MMD) between random linear projections of the two distributions and aggregating over the projections. We propose a data-driven estimate of this measure and use it as the test statistic. The large-sample distributions of this statistic are derived under both the null and the alternative hypotheses. The test statistic involves a kernel function and an associated bandwidth. We prove that the resulting test is consistent in large samples for any data-driven choice of bandwidth that converges in probability to a positive number. Since the theoretical quantiles of the limiting null distribution are intractable, the test is calibrated in practice using the permutation method. We also derive the limiting distribution of the permuted test statistic and the asymptotic power of the permutation test under local contiguous alternatives. This shows that the permutation test is consistent and statistically efficient in the Pitman sense. Extensive simulation studies are carried out, and a real data set is analyzed, to compare the performance of the proposed test with that of some state-of-the-art methods.
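The pipeline described above (project, compute MMD, aggregate, calibrate by permutation) can be illustrated with a minimal sketch. It is not the authors' estimator. It assumes curves discretized on a common grid, projection directions drawn as standard Gaussian vectors, a Gaussian kernel, a median-heuristic bandwidth computed on the pooled projected sample, and a biased (V-statistic) MMD estimate. None of these choices is specified in the abstract, and the function names `aggregated_gram`, `mmd2_from_gram`, and `permutation_test` are hypothetical.

```python
import numpy as np


def aggregated_gram(pooled, directions):
    """Average of 1-D Gaussian Gram matrices over random projections.

    pooled:     (N, d) array, both samples stacked, curves on a common grid.
    directions: (n_proj, d) array of projection directions (assumed Gaussian).
    """
    proj = pooled @ directions.T              # (N, n_proj) projected samples
    N = pooled.shape[0]
    upper = np.triu_indices(N, k=1)
    K = np.zeros((N, N))
    for z in proj.T:
        dists = np.abs(z[:, None] - z[None, :])
        # Median heuristic on the pooled sample (an assumed bandwidth rule).
        bw = np.median(dists[upper])
        if bw <= 0:
            bw = 1.0
        K += np.exp(-dists**2 / (2.0 * bw**2))
    return K / directions.shape[0]


def mmd2_from_gram(K, idx_x, idx_y):
    """Biased (V-statistic) MMD^2 estimate read off a pooled Gram matrix."""
    kxx = K[np.ix_(idx_x, idx_x)].mean()
    kyy = K[np.ix_(idx_y, idx_y)].mean()
    kxy = K[np.ix_(idx_x, idx_y)].mean()
    return kxx - 2.0 * kxy + kyy


def permutation_test(X, Y, n_proj=50, n_perm=200, seed=0):
    """Return (statistic, permutation p-value) for samples X and Y."""
    rng = np.random.default_rng(seed)
    n, m = X.shape[0], Y.shape[0]
    pooled = np.vstack([X, Y])
    directions = rng.standard_normal((n_proj, pooled.shape[1]))
    # The Gram matrix depends only on the pooled sample, so it is computed
    # once and re-indexed for every permutation of the group labels.
    K = aggregated_gram(pooled, directions)
    idx = np.arange(n + m)
    observed = mmd2_from_gram(K, idx[:n], idx[n:])
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n + m)
        if mmd2_from_gram(K, perm[:n], perm[n:]) >= observed:
            exceed += 1
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((30, 20))          # 30 curves on a 20-point grid
    Y = rng.standard_normal((30, 20)) + 1.0    # mean-shifted second sample
    stat, p = permutation_test(X, Y)
    print(f"statistic = {stat:.4f}, permutation p-value = {p:.4f}")
```

Because the MMD estimate is linear in the kernel, averaging the Gram matrices over projections before forming the statistic gives the same value as averaging the per-projection MMD estimates. This keeps the permutation loop cheap in the sketch. Computing the bandwidth on the pooled sample also makes it invariant to relabeling.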