We consider a nonparametric regression model in which the signal is the sum of a piecewise constant function and a smooth function. To detect the change-points and estimate the regression functions, we propose PCpluS, a combination of the fused Lasso and kernel smoothing. In contrast to existing approaches, it explicitly uses the assumption that the signal can be decomposed into a piecewise constant and a smooth function when detecting change-points. This is motivated by several applications and by theoretical results about partially linear models. Tuning parameters are selected by cross-validation. We argue that in this setting minimizing the L1-loss is superior to minimizing the L2-loss. We also highlight important consequences for cross-validation in piecewise constant change-point regression. Simulations demonstrate that our approach has a small average mean squared error and detects change-points well. We apply the methodology to genome sequencing data to detect copy number variations. Finally, we demonstrate its flexibility by combining it with smoothing splines and by proposing extensions to multivariate and filtered data.
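To make the model concrete, the following is a minimal sketch of the setting only, not of PCpluS itself. It simulates a signal that is the sum of a single step and a smooth function, then locates the step with a profile least-squares fit of the kind used for partially linear models: the smooth part is removed with a Gaussian Nadaraya-Watson smoother and the jump is fitted on the residuals. The function names, the fixed bandwidth, the single-change-point restriction, and the simulated signal are all assumptions made for illustration. The fused Lasso step and the cross-validated tuning described above are deliberately left out.

```python
import numpy as np


def kernel_smoother_matrix(x, bandwidth):
    """Nadaraya-Watson smoother matrix with a Gaussian kernel (rows sum to 1)."""
    diffs = (x[:, None] - x[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs**2)
    return weights / weights.sum(axis=1, keepdims=True)


def fit_single_jump(x, y, bandwidth, min_segment=5):
    """Illustrative profile least-squares fit of y = jump * 1{i >= k} + smooth + noise.

    Assumes exactly one change-point; this is a simplification, not PCpluS.
    """
    n = len(x)
    smoother = kernel_smoother_matrix(x, bandwidth)
    residual_maker = np.eye(n) - smoother
    y_tilde = residual_maker @ y  # observations with the smooth trend removed
    index = np.arange(n)

    best = None
    for k in range(min_segment, n - min_segment):
        z_tilde = residual_maker @ (index >= k).astype(float)
        jump = (z_tilde @ y_tilde) / (z_tilde @ z_tilde)  # least-squares jump size
        rss = np.sum((y_tilde - jump * z_tilde) ** 2)
        if best is None or rss < best[0]:
            best = (rss, k, jump)

    _, k_hat, jump_hat = best
    step_hat = jump_hat * (index >= k_hat)
    smooth_hat = smoother @ (y - step_hat)  # smooth part after removing the step
    return k_hat, jump_hat, step_hat, smooth_hat


# Simulated example: one jump of size 2 added to a sine curve, plus Gaussian noise.
rng = np.random.default_rng(0)
n, true_k = 200, 120
x = np.arange(n) / n
signal = np.sin(2 * np.pi * x) + 2.0 * (np.arange(n) >= true_k)
y = signal + 0.1 * rng.standard_normal(n)

k_hat, jump_hat, step_hat, smooth_hat = fit_single_jump(x, y, bandwidth=0.05)
print(f"estimated change-point index: {k_hat}, estimated jump: {jump_hat:.2f}")
```

The bandwidth is fixed by hand here. In the approach summarized above it would be a tuning parameter chosen by cross-validation, and the number of change-points would not be assumed known.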