We propose Narrowest Significance Pursuit (NSP), a general and flexible methodology for automatically detecting localised regions in data sequences which each must contain a change-point, at a prescribed global significance level. Here, change-points are understood as abrupt changes in the parameters of an underlying linear model. NSP works by fitting the postulated linear model over many regions of the data, using a certain multiresolution sup-norm loss, and identifying the shortest interval on which the linearity is significantly violated. The procedure then continues recursively to the left and to the right until no further intervals of significance can be found. The use of the multiresolution sup-norm loss is a key feature of NSP, as it enables the transfer of significance considerations to the domain of the unobserved true residuals, a substantial simplification. It also guarantees important stochastic bounds which directly yield exact desired coverage probabilities, regardless of the form or number of the regressors. NSP works with a wide range of distributional assumptions on the errors, including Gaussian with known or unknown variance, some light-tailed distributions, and some heavy-tailed, possibly heterogeneous distributions via self-normalisation. It also works in the presence of autoregression. The mathematics of NSP is, by construction, uncomplicated, and its key computational component uses simple linear programming. In contrast to the widely studied "post-selection inference" approach, NSP enables the opposite viewpoint and paves the way for the concept of "post-inference selection". Pre-CRAN R code implementing NSP is available at https://github.com/pfryz/nsp.
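To make the search described above concrete, the following is a minimal sketch of the idea in the simplest special case, a piecewise-constant mean, and is not the authors' implementation. It makes several assumptions that are not in the abstract: the threshold `lam` is a hypothetical user-supplied constant (in NSP proper it would be chosen to deliver the prescribed global significance level), every sub-interval is examined rather than a grid, and the recursion uses the non-overlapping parts to the left and right of a detected interval. With a single constant regressor, the linear programme reduces to checking whether an intersection of one-dimensional intervals is empty, so no LP solver is needed here; with general regressors a genuine LP would be required.

```python
from itertools import accumulate
from math import sqrt


def violates_constant_model(y, s, e, lam):
    """Return True if no constant beta fits y[s..e] (inclusive) to within lam
    in the multiresolution sup-norm, i.e. if for every beta some sub-interval
    [a, b] has |sum(y[a..b] - beta)| / sqrt(b - a + 1) > lam."""
    prefix = [0.0] + list(accumulate(y[s:e + 1]))
    lo, hi = float("-inf"), float("inf")
    m = e - s + 1
    for a in range(m):
        for b in range(a, m):
            length = b - a + 1
            mean = (prefix[b + 1] - prefix[a]) / length
            half = lam / sqrt(length)
            # Each sub-interval confines beta to [mean - half, mean + half].
            lo = max(lo, mean - half)
            hi = min(hi, mean + half)
            if lo > hi:  # empty intersection: constancy is violated
                return True
    return False


def narrowest_significant_interval(y, lam):
    """Scan widths from shortest to longest; return the first (s, e) on which
    the constant model is violated, or None if there is none."""
    n = len(y)
    for width in range(2, n + 1):
        for s in range(n - width + 1):
            e = s + width - 1
            if violates_constant_model(y, s, e, lam):
                return (s, e)
    return None


def nsp_sketch(y, lam, offset=0):
    """Recursive search: find the narrowest significant interval, then repeat
    on the data strictly to its left and strictly to its right (an assumed,
    simplified recursion rule)."""
    found = narrowest_significant_interval(y, lam)
    if found is None:
        return []
    s, e = found
    left = nsp_sketch(y[:s], lam, offset)
    right = nsp_sketch(y[e + 1:], lam, offset + e + 1)
    return left + [(offset + s, offset + e)] + right


if __name__ == "__main__":
    # Noise-free toy data with jumps between indices 7/8 and 15/16.
    y = [0.0] * 8 + [5.0] * 8 + [0.0] * 8
    print(nsp_sketch(y, lam=2.0))
```

Each returned pair is an interval of indices that, under the stated assumptions, must contain a change in the mean. The exhaustive scan is cubic or worse in the sequence length, so it is suitable only for short illustrative sequences.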