Given a time series ${\bf Y}$ in $\mathbb{R}^n$ with a piecewise constant mean and independent components, the twin problems of change-point detection and change-point localization respectively amount to detecting the existence of times where the mean varies and estimating the positions of those change-points. In this work, we tightly characterize optimal rates for both problems and uncover the phase transition phenomenon from a global testing problem to a local estimation problem. Introducing a suitable definition of the energy of a change-point, we first establish in the single change-point setting that the optimal detection threshold is $\sqrt{2\log\log(n)}$. When the energy is just above the detection threshold, the problem of localizing the change-point becomes purely parametric: it depends only on the difference in means and no longer on the position of the change-point. Interestingly, for most change-point positions, it is possible to detect and localize them at a much smaller energy level. In the multiple change-point setting, we establish the energy detection threshold and show similarly that the optimal localization error of a specific change-point becomes purely parametric. Along the way, tight optimal rates for the Hausdorff and $\ell_1$ estimation losses of the vector of all change-point positions are also established. Two procedures achieving these optimal rates are introduced. The first one is a least-squares estimator with a new multiscale penalty that favours well-spread change-points. The second one is a two-step multiscale post-processing procedure whose computational complexity can be as low as $O(n\log(n))$. Notably, these two procedures accommodate the presence of possibly many low-energy, and therefore undetectable, change-points, and are still able to detect and localize high-energy change-points even in the presence of those nuisance parameters.
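To make the single change-point setting concrete, here is a minimal Python sketch of a classical CUSUM scan. It is not one of the two procedures introduced in this work. It assumes unit-variance noise, and it takes the energy of a jump of size $\Delta$ at position $\tau$ to be $|\Delta|\sqrt{\tau(n-\tau)/n}$, a common normalization that may differ from the definition used in the paper. The names `cusum_scan`, `reference_level` and `simulate` are hypothetical helpers chosen for illustration.

```python
import math
import random


def cusum_scan(y):
    """Scan all split points t = 1..n-1 and return (best_t, max_stat).

    The statistic at t is sqrt(t (n - t) / n) * |mean(y[t:]) - mean(y[:t])|,
    which is a standard normal in absolute value under the no-change null
    when the noise is Gaussian with unit variance (assumption of this sketch).
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(y)
    left = 0.0  # running sum of y[:t]
    best_t, best_stat = 1, -1.0
    for t in range(1, n):
        left += y[t - 1]
        mean_left = left / t
        mean_right = (total - left) / (n - t)
        stat = math.sqrt(t * (n - t) / n) * abs(mean_right - mean_left)
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat


def reference_level(n):
    """The sqrt(2 log log n) level quoted in the abstract (needs n >= 3)."""
    if n < 3:
        raise ValueError("log(log(n)) is not positive for n < 3")
    return math.sqrt(2.0 * math.log(math.log(n)))


def simulate(n, tau, delta, seed=0):
    """Unit-variance Gaussian noise plus a mean shift of delta from index tau on."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) + (delta if i >= tau else 0.0) for i in range(n)]
```

On the noiseless step `[0.0] * 50 + [1.0] * 50`, `cusum_scan` returns the split `50` with statistic `5.0`, which equals the assumed energy $1 \cdot \sqrt{50 \cdot 50 / 100}$. On noisy data, one would compare the maximal statistic with `reference_level(n)`; an actual test needs a calibrated constant on top of this level, which the sketch leaves out.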