A P-spline represents an unknown univariate function with uniform B-splines on equidistant knots and penalizes their coefficients with a simple difference matrix to enforce smoothness. For non-uniform B-splines on unevenly spaced knots, however, this difference penalty fails, and the conventional derivative penalty has hitherto been the only choice. We propose a general P-spline estimator that lifts this restriction by deriving a general difference penalty for non-uniform B-splines. We also establish a sandwich formula relating the derivative and general difference penalties, which clarifies how the two are connected. Simulations show that the two P-spline variants have similar MSE performance in general, but in practice one may yield a more satisfactory fit than the other. For example, the bone mineral content (BMC) data favor the general P-spline, while the fossil shell data favor the standard P-spline. We therefore believe both variants are useful tools for practical modeling. To implement our general P-spline, we developed two new R packages: gps and gps.mgcv. The latter creates a new "gps" smooth class for mgcv, so that a general P-spline can be specified as s(x, bs = "gps") in a model formula and estimated in the framework of generalized additive models.
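
For concreteness, the standard P-spline described above is commonly written as a penalized least squares problem. The notation here is assumed for illustration and does not come from the abstract: y is the response vector, B the design matrix of p uniform B-splines, \beta the coefficient vector, \lambda \ge 0 a smoothing parameter, and D_m the m-th order difference matrix (shown for m = 2). This is a sketch of the standard case only, not of the general difference penalty for non-uniform B-splines.

\[
\hat{\beta} = \arg\min_{\beta} \; \|y - B\beta\|^2 + \lambda \|D_m \beta\|^2,
\qquad
D_2 =
\begin{pmatrix}
1 & -2 & 1 & & \\
 & \ddots & \ddots & \ddots & \\
 & & 1 & -2 & 1
\end{pmatrix}
\in \mathbb{R}^{(p-2) \times p}
\]

Each row of D_2 takes a second-order difference of adjacent coefficients, \beta_j - 2\beta_{j+1} + \beta_{j+2}. The rows are not weighted by knot spacing, which is consistent with the abstract's restriction of this penalty to equidistant knots.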