Change-point analysis plays a significant role in various fields by revealing distributional changes in a sequence of observations. While a number of algorithms have been proposed for high-dimensional data, kernel-based methods have not been well explored because of difficulties in controlling false discoveries and their mediocre performance. In this paper, we propose a new kernel-based framework that exploits an important pattern of data in high dimensions to boost power. We derive analytic approximations to the significance of the new statistics and propose fast tests based on the asymptotic results, offering easy off-the-shelf tools for large datasets. The new tests show superior performance over a wide range of alternatives when compared with other state-of-the-art methods. We illustrate the new approaches through an analysis of phone-call network data.