We propose a data segmentation methodology for the high-dimensional linear regression problem in which the regression parameters are allowed to undergo multiple changes. The proposed methodology, MOSEG, proceeds in two stages: the data are first scanned for multiple change points using a moving window-based procedure, and the resulting estimates are then passed to a location refinement stage. MOSEG is computationally efficient thanks to the adoption of a coarse grid in the first stage, and it achieves theoretical consistency in estimating both the total number and the locations of the change points without requiring independence or sub-Gaussianity. In particular, it nearly matches minimax optimal rates when Gaussianity is assumed. We also propose MOSEG.MS, a multiscale extension of MOSEG which, while comparable to MOSEG in computational complexity, achieves theoretical consistency for a broader parameter space that permits multiscale change points. We demonstrate good performance of the proposed methods in comparative simulation studies and in applications to climate science and economic datasets.
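To make the two-stage idea concrete, the following is a minimal sketch of a moving-window scan on a coarse grid followed by a local refinement step. It is an illustration under stated assumptions, not the MOSEG implementation: ridge-regularised least squares stands in for whatever sparse estimator a genuinely high-dimensional setting would require, the detector is taken to be the scaled Euclidean distance between the estimates from adjacent windows, and the names `ridge_fit`, `scan`, `refine`, as well as the bandwidth `G`, grid spacing `step`, and `threshold`, are all hypothetical choices made for this example.

```python
import numpy as np


def ridge_fit(X, y, lam=1e-2):
    """Ridge estimate; an assumed stand-in for a sparse (e.g. Lasso-type) fit."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def scan(X, y, G, step, threshold):
    """Stage 1: evaluate a moving-window detector on a coarse grid.

    At each grid point k, parameters are estimated on [k-G, k) and [k, k+G),
    and the scaled distance between the two estimates is recorded. Grid points
    that exceed the threshold and are maximal within distance G are returned.
    """
    n = len(y)
    grid = np.arange(G, n - G + 1, step)
    stats = np.array([
        np.sqrt(G / 2) * np.linalg.norm(
            ridge_fit(X[k:k + G], y[k:k + G]) - ridge_fit(X[k - G:k], y[k - G:k])
        )
        for k in grid
    ])
    candidates = []
    for i, k in enumerate(grid):
        near = np.abs(grid - k) < G
        if stats[i] > threshold and stats[i] == stats[near].max():
            candidates.append(int(k))
    return candidates, grid, stats


def refine(X, y, k_pre, G):
    """Stage 2: refine a preliminary estimate by minimising a local squared-error cost.

    Parameters are estimated on the outer parts of [k_pre-G, k_pre+G), which
    are assumed to be free of change points, and the split point in the middle
    part is chosen to minimise the total squared residual.
    """
    h = G // 2
    lo, hi = k_pre - G, k_pre + G
    beta_left = ridge_fit(X[lo:lo + h], y[lo:lo + h])
    beta_right = ridge_fit(X[hi - h:hi], y[hi - h:hi])
    seg_X, seg_y = X[lo:hi], y[lo:hi]
    res_left = (seg_y - seg_X @ beta_left) ** 2
    res_right = (seg_y - seg_X @ beta_right) ** 2
    # cum_*[j] holds the sum of the first j squared residuals in the segment.
    cum_left = np.concatenate([[0.0], np.cumsum(res_left)])
    cum_right = np.concatenate([[0.0], np.cumsum(res_right)])
    js = np.arange(h, 2 * G - h + 1)  # candidate split offsets within the segment
    cost = cum_left[js] + (cum_right[-1] - cum_right[js])
    return lo + int(js[np.argmin(cost)])


# Simulated example (all values are illustrative): one change at t = 200.
rng = np.random.default_rng(0)
n, p, true_cp = 400, 5, 200
X = rng.standard_normal((n, p))
beta_before = np.zeros(p)
beta_after = np.array([2.0, -2.0, 0.0, 0.0, 0.0])
y = np.where(np.arange(n) < true_cp, X @ beta_before, X @ beta_after)
y = y + 0.5 * rng.standard_normal(n)

candidates, grid, stats = scan(X, y, G=60, step=10, threshold=5.0)
refined = [refine(X, y, k, G=60) for k in candidates]
print("preliminary:", candidates, "refined:", refined)
```

Evaluating the detector only every `step` observations is what keeps the first stage cheap; the refinement then recovers the resolution lost to the grid by searching every time point in a neighbourhood of each preliminary estimate. The threshold here is set by hand for the simulated data and would need a principled choice in practice.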