A promising approach for scalable Gaussian processes (GPs) is the Karhunen-Lo\`eve (KL) decomposition, in which the GP kernel is represented by a set of basis functions that are the eigenfunctions of the kernel operator. Such decomposed kernels have the potential to be very fast and do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, so variable selection becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA (BSS-ANOVA) kernel, coupled with fast Gibbs sampling in a fully Bayesian approach. It quickly and effectively limits the number of terms, yielding a method with competitive accuracy and competitive training and inference times for tabular datasets of low feature-set dimensionality. The inference speed and accuracy make the method especially useful for dynamic system identification: the dynamics are modeled in the tangent space as a static problem, and the learned dynamics are then integrated using a high-order scheme. The methods are demonstrated on two dynamic datasets: a `Susceptible, Infected, Recovered' (SIR) toy problem, with the transmissibility used as the forcing function, and the experimental `Cascaded Tanks' benchmark dataset. For the static prediction of time derivatives, comparisons are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing-points scalable GP; for the time-series prediction, comparisons are made with LSTM and GRU recurrent neural networks (RNNs) and with the SINDy package.
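The two-stage idea for dynamic systems (fit the time derivatives as a static regression, then integrate the fitted model) can be illustrated with a minimal sketch on a synthetic SIR system. Everything specific here is an assumption made for illustration: a polynomial least-squares fit stands in for the BSS-ANOVA GP, classical fourth-order Runge-Kutta stands in for the unspecified high-order scheme, the forcing is held constant over each step, and the recovery rate, sampling ranges, and transmissibility profile are hypothetical values.

```python
import numpy as np

def sir_rhs(state, beta, gamma=0.1):
    """Reference SIR derivatives for the fractions (S, I); R = 1 - S - I."""
    s, i = state
    return np.array([-beta * s * i, beta * s * i - gamma * i])

def features(state, beta):
    """Polynomial features of (S, I, beta); a stand-in for a learned kernel basis."""
    s, i = state[..., 0], state[..., 1]
    b = beta * np.ones_like(s)
    one = np.ones_like(s)
    return np.stack([one, s, i, b, s * i, b * s, b * i, b * s * i], axis=-1)

def rk4_step(f, state, u, dt):
    """One classical 4th-order Runge-Kutta step, forcing u held constant."""
    k1 = f(state, u)
    k2 = f(state + 0.5 * dt * k1, u)
    k3 = f(state + 0.5 * dt * k2, u)
    k4 = f(state + dt * k3, u)
    return state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def simulate(f, x0, forcing, dt):
    """Roll the dynamics f forward over a sequence of forcing values."""
    traj = [np.asarray(x0, dtype=float)]
    for u in forcing:
        traj.append(rk4_step(f, traj[-1], u, dt))
    return np.array(traj)

# Stage 1: static (tabular) regression from (S, I, beta) to (dS/dt, dI/dt).
# Targets come from the known model; real data would need estimated derivatives.
rng = np.random.default_rng(0)
gamma = 0.1
S = rng.uniform(0.0, 1.0, 500)
I = rng.uniform(0.0, 1.0, 500)
B = rng.uniform(0.1, 0.6, 500)
X = np.stack([S, I], axis=1)
Y = np.stack([-B * S * I, B * S * I - gamma * I], axis=1)
W, *_ = np.linalg.lstsq(features(X, B), Y, rcond=None)

def learned_rhs(state, beta):
    """Derivative prediction from the fitted static model."""
    return features(state, beta) @ W

# Stage 2: integrate the learned derivatives under a time-varying transmissibility.
dt = 0.5
t = dt * np.arange(200)
beta_t = 0.3 + 0.15 * np.sin(0.1 * t)
x0 = [0.99, 0.01]
true_traj = simulate(sir_rhs, x0, beta_t, dt)
learned_traj = simulate(learned_rhs, x0, beta_t, dt)
max_err = np.max(np.abs(true_traj - learned_traj))
print(f"max rollout error: {max_err:.2e}")
```

Because the feature set contains the exact terms of the SIR right-hand side, the fit here is essentially exact, so the sketch only demonstrates the workflow; it says nothing about the accuracy of the method reported in the paper.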