In recent years, change point detection for high-dimensional data has become increasingly important in many scientific fields. Most of the literature develops separate methods designed for specific models (e.g., the mean shift model, the vector auto-regressive model, the graphical model). In this paper, we provide a unified framework for structural break detection that is suitable for a large class of models. Moreover, the proposed algorithm automatically produces consistent parameter estimates during the change point detection process, without the need to refit the model. Specifically, we introduce a three-step procedure. The first step combines a block segmentation strategy with a fused lasso based estimation criterion, which leads to significant computational gains without compromising statistical accuracy in identifying the number and location of the structural breaks. This step is then coupled with hard-thresholding and exhaustive search steps to consistently estimate the number and location of the break points. Strong guarantees are proved on both the number of estimated change points and the rates of convergence of their locations. Consistent estimates of the model parameters are also provided. Numerical studies provide further support for the theory and validate the method's competitive performance for a wide range of models. The developed algorithm is implemented in the R package LinearDetect.
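To make the three-step procedure described in the abstract concrete, the following is a minimal Python sketch for the simplest case, a univariate mean shift model. It is not the LinearDetect implementation and does not reproduce the paper's estimator or tuning rules. All function names, the block size `b`, the penalty `lam`, the threshold `eta`, and the clustering gap are hypothetical choices made for illustration. Step 1 solves a fused lasso problem on block means by coordinate descent on its box-constrained dual, step 2 hard-thresholds the estimated jumps and groups neighbouring candidates, and step 3 searches each local window exhaustively for the split with the smallest residual sum of squares.

```python
import random

def block_means(y, b):
    """Block segmentation: average the series over consecutive blocks of length b."""
    return [sum(y[i:i + b]) / len(y[i:i + b]) for i in range(0, len(y), b)]

def fused_lasso_blocks(m, lam, sweeps=2000):
    """Step 1: minimise 0.5*sum((m_j - beta_j)^2) + lam*sum(|beta_{j+1} - beta_j|)
    via coordinate descent on the dual variables u, each constrained to [-lam, lam]."""
    k = len(m)
    u = [0.0] * (k - 1)
    for _ in range(sweeps):
        for i in range(k - 1):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < k - 2 else 0.0
            # Unconstrained coordinate minimiser, then projection onto the box.
            u[i] = max(-lam, min(lam, 0.5 * (m[i + 1] - m[i] + left + right)))
    # Recover the primal solution: beta_j = m_j - (u_{j-1} - u_j).
    return [m[j] - ((u[j - 1] if j > 0 else 0.0) - (u[j] if j < k - 1 else 0.0))
            for j in range(k)]

def threshold_and_cluster(beta, b, eta, gap):
    """Step 2: keep block boundaries whose estimated jump exceeds eta, then
    merge candidates that lie within `gap` time points of each other."""
    cand = [(j + 1) * b for j in range(len(beta) - 1)
            if abs(beta[j + 1] - beta[j]) > eta]
    clusters = []
    for t in cand:
        if clusters and t - clusters[-1][-1] <= gap:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters

def sse(seg):
    """Residual sum of squares of a segment around its own mean."""
    mu = sum(seg) / len(seg)
    return sum((v - mu) ** 2 for v in seg)

def exhaustive_search(y, clusters, b):
    """Step 3: inside a window around each cluster, try every split point and
    keep the one that minimises the two-segment residual sum of squares."""
    est = []
    for c in clusters:
        lo, hi = max(0, c[0] - b), min(len(y), c[-1] + b)
        est.append(min(range(lo + 1, hi),
                       key=lambda s: sse(y[lo:s]) + sse(y[s:hi])))
    return est

# Simulated data (an assumption for illustration): mean 0, then 3, then 0,
# with breaks at t = 103 and t = 197 that do not align with block boundaries.
rng = random.Random(0)
y = [(3.0 if 103 <= t < 197 else 0.0) + rng.gauss(0.0, 0.5) for t in range(300)]

b = 10
beta = fused_lasso_blocks(block_means(y, b), lam=1.0)
clusters = threshold_and_cluster(beta, b, eta=1.0, gap=2 * b)
estimates = exhaustive_search(y, clusters, b)
print(estimates)
```

In this sketch the fused lasso operates on 30 block means rather than 300 observations, which is where the computational saving of block segmentation comes from. The price is that step 1 can only place a break at a block boundary, so the local exhaustive search in step 3 is what recovers a location inside a block.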