Smoothing splines are a standard method of nonparametric regression for obtaining smooth functions from noisy observations. However, because cubic smoothing splines are twice continuously differentiable by construction, they cannot capture potential discontinuities in the underlying signal. The smoothing spline model can be augmented so that discontinuities at a priori unknown locations are incorporated. The augmented model leads to a minimization problem over both the spline and the discontinuity locations. The minimizing solution is a cubic smoothing spline with discontinuities (CSSD), which may serve as a function estimator for discontinuous signals, as a changepoint detector, and as a tool for exploratory data analysis. However, these applications are hardly accessible at the moment because there is no efficient algorithm for computing a CSSD. In this paper, we propose an efficient algorithm that computes a global minimizer of the underlying problem. Its worst-case complexity is quadratic in the number of data points. If the number of detected discontinuities scales linearly with the signal length, we observe linear growth of the runtime. With the proposed algorithm, a CSSD can be computed in reasonable time on standard hardware. Furthermore, we implement a strategy for automatic selection of the hyperparameters. Numerical examples demonstrate the applicability of a CSSD for the tasks mentioned above.
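To illustrate what minimizing over discontinuity locations with quadratic worst-case cost can look like, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a simplified stand-in model: each segment is scored by the residual of a least-squares line (roughly the very-smooth limit of a smoothing spline) rather than by the actual spline energy, and each discontinuity incurs a penalty `gamma`. The names `partition_with_jumps`, `segment_cost`, and `gamma` are hypothetical and chosen only for this example.

```python
import numpy as np


def partition_with_jumps(x, y, gamma):
    """Globally minimize (sum of segment costs) + gamma * (number of jumps).

    ASSUMPTION: the segment cost is the residual sum of squares of a
    least-squares line, a stand-in for the smoothing spline energy.
    Returns the jump positions (index of the first sample of each new
    segment) and the minimal energy.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    def csum(v):
        # Prefix sums with a leading zero, so sums over [l, r) are s[r] - s[l].
        return np.concatenate(([0.0], np.cumsum(v)))

    sx, sy = csum(x), csum(y)
    sxx, sxy, syy = csum(x * x), csum(x * y), csum(y * y)

    def segment_cost(l, r):
        # Residual of the best line on samples l, ..., r-1, in O(1) time.
        m = r - l
        dx, dy = sx[r] - sx[l], sy[r] - sy[l]
        cxx = (sxx[r] - sxx[l]) - dx * dx / m
        cxy = (sxy[r] - sxy[l]) - dx * dy / m
        cyy = (syy[r] - syy[l]) - dy * dy / m
        rss = cyy - cxy * cxy / cxx if cxx > 1e-12 else cyy
        return max(rss, 0.0)  # guard against tiny negative round-off

    # best[r] = minimal energy for the first r samples; best[0] = -gamma
    # so that the first segment does not pay for a jump.
    best = np.full(n + 1, np.inf)
    best[0] = -gamma
    prev = np.zeros(n + 1, dtype=int)
    for r in range(1, n + 1):          # O(n^2) candidate segments in total
        for l in range(r):
            cand = best[l] + gamma + segment_cost(l, r)
            if cand < best[r]:
                best[r], prev[r] = cand, l

    # Backtrack the optimal segment boundaries.
    jumps = []
    l = prev[n]
    while l > 0:
        jumps.append(int(l))
        l = prev[l]
    return sorted(jumps), float(best[n])


# Usage: a noise-free ramp with a single jump of height 2 in the middle.
x = np.linspace(0.0, 1.0, 40)
y = np.where(x < 0.5, x, x + 2.0)
jumps, energy = partition_with_jumps(x, y, gamma=0.5)
print(jumps, energy)
```

In this sketch the O(1) segment cost comes from prefix sums, which is what keeps the double loop quadratic overall; for actual spline energies, achieving the same bound requires a more careful update scheme such as the one the paper proposes.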