We consider the problem of detecting distributional changes in a sequence of high-dimensional data. Our approach combines two separate statistics stemming from $L_p$ norms whose behavior is similar under $H_0$ but potentially different under $H_A$, leading to a testing procedure that is flexible against a variety of alternatives. We establish the asymptotic distribution of our proposed test statistics separately in the cases of weakly dependent and strongly dependent coordinates as $\min\{N,d\}\to\infty$, where $N$ denotes the sample size and $d$ is the dimension, and we establish consistency of the testing and estimation procedures in high dimensions under one-change alternative settings. Computational studies in single and multiple change point scenarios demonstrate that our method can outperform other nonparametric approaches in the literature for certain alternatives in high dimensions. We illustrate our approach through an application to Twitter data concerning the mentions of U.S. Governors.
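To make the idea of an $L_p$-norm change point statistic concrete, here is a minimal sketch in Python. It is not the procedure proposed in the paper, whose two statistics are not specified in this abstract. It is a generic CUSUM-type scan that measures, for each candidate split $k$, the $L_p$ norm of the centered partial-sum process across the $d$ coordinates and estimates a single change point by the maximizing $k$. The function name `lp_cusum_scan`, the normalization by $\sqrt{N}$, and the simulated mean-shift data are all assumptions made for illustration.

```python
import numpy as np


def lp_cusum_scan(X, p=2.0):
    """Generic L_p CUSUM scan (illustrative only, not the paper's statistic).

    X : array of shape (N, d), one observation per row.
    p : finite exponent >= 1 of the L_p norm taken across coordinates.
    Returns (k_hat, stats), where stats[k - 1] is the statistic at split k
    for k = 1, ..., N - 1 and k_hat is the maximizing split.
    """
    X = np.asarray(X, dtype=float)
    N, _ = X.shape
    csum = np.cumsum(X, axis=0)            # partial sums S_k, shape (N, d)
    total = csum[-1]                       # S_N
    ks = np.arange(1, N)                   # candidate split points 1..N-1
    # Centered partial sums (S_k - (k/N) S_N) / sqrt(N), one row per k.
    cusum = (csum[:-1] - np.outer(ks / N, total)) / np.sqrt(N)
    # L_p norm across the d coordinates for every candidate split.
    stats = (np.abs(cusum) ** p).sum(axis=1) ** (1.0 / p)
    k_hat = int(ks[np.argmax(stats)])
    return k_hat, stats


# Hypothetical example: N = 200 observations in d = 50 dimensions,
# with a mean shift in every coordinate after observation 120.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
X[120:] += 1.0
k_hat, stats = lp_cusum_scan(X, p=2.0)
print("estimated change point:", k_hat)
```

Varying `p` changes which alternatives the scan is sensitive to: small `p` aggregates many weak coordinate-wise changes, while large `p` emphasizes a few strong ones. This is one intuition for why combining statistics built from different norms can be flexible against a variety of alternatives.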