Detecting change-points in data is challenging because both the possible types of change and the behaviour of the data when there is no change vary widely. Statistically efficient methods for detecting a change depend on both of these features, and it can be difficult for a practitioner to develop an appropriate detection method for their application of interest. We show how to automatically generate new detection methods by training a neural network. Our approach is motivated by the fact that many existing tests for the presence of a change-point can be represented by a simple neural network, so a neural network trained with sufficient data should perform at least as well as these methods. We present theory that quantifies the error rate of such an approach and how it depends on the amount of training data. Empirical results show that, even with limited training data, the approach is competitive with the standard CUSUM test for detecting a change in mean when the noise is independent and Gaussian, and can substantially outperform it in the presence of auto-correlated or heavy-tailed noise. Our method also shows strong results in detecting and localising changes in activity based on accelerometer data.
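For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a textbook CUSUM test for a single change in mean. It is not code from the paper. The function names `cusum_statistics` and `cusum_test` are hypothetical, the noise is assumed to have unit variance, and the threshold is a user-chosen value rather than one taken from the paper. The statistic at each candidate location is the absolute value of a linear function of the data, and the test takes the maximum over locations, which gives some intuition for why such a test could be written as a simple network.

```python
import math
from itertools import accumulate


def cusum_statistics(x):
    """Return [C_1, ..., C_{n-1}] for a sequence x of length n, where
    C_t = sqrt(t * (n - t) / n) * |mean(x[:t]) - mean(x[t:])|.
    Assumes unit-variance noise (an assumption of this sketch)."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    prefix = list(accumulate(x))  # prefix[t - 1] = x_1 + ... + x_t
    total = prefix[-1]
    stats = []
    for t in range(1, n):
        left_mean = prefix[t - 1] / t
        right_mean = (total - prefix[t - 1]) / (n - t)
        scale = math.sqrt(t * (n - t) / n)
        stats.append(scale * abs(left_mean - right_mean))
    return stats


def cusum_test(x, threshold):
    """Return (change_detected, estimated_location, max_statistic).
    The location is the number of observations before the estimated change.
    `threshold` is chosen by the user; this sketch does not calibrate it."""
    stats = cusum_statistics(x)
    best = max(range(len(stats)), key=stats.__getitem__)
    return stats[best] > threshold, best + 1, stats[best]


# Illustrative, noise-free data: the mean jumps from 0 to 2 after 10 points.
series = [0.0] * 10 + [2.0] * 10
detected, location, statistic = cusum_test(series, threshold=3.0)
```

In this noise-free example the statistic peaks at the true change location. With real data the threshold would need to be calibrated to the noise distribution, which is the step that becomes hard under the auto-correlated or heavy-tailed noise mentioned in the abstract.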