A promising approach for scalable Gaussian processes (GPs) is the Karhunen-Loève (KL) decomposition, in which the GP kernel is represented by a set of basis functions that are the eigenfunctions of the kernel operator. Such decomposed kernels have the potential to be very fast, and do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, and variable selection becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA (BSS-ANOVA) kernel, coupled with fast Gibbs sampling in a fully Bayesian approach. It quickly and effectively limits the number of terms, yielding a method with competitive accuracy, training time, and inference time for tabular datasets of low feature set dimensionality. The inference speed and accuracy make the method especially useful for dynamic systems identification: the dynamics are modeled in the tangent space as a static problem, and the learned dynamics are then integrated using a high-order scheme. The methods are demonstrated on two dynamic datasets: a 'Susceptible, Infected, Recovered' (SIR) toy problem, with the transmissibility used as the forcing function, and the experimental 'Cascaded Tanks' benchmark dataset. Comparisons on the static prediction of time derivatives are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing-points scalable GP, while for the time series prediction comparisons are made with LSTM and GRU recurrent neural networks (RNNs) along with a number of basis set / optimizer combinations within the SINDy package.
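The dynamic identification workflow described above (learn the time derivatives as a static regression problem, then integrate the learned dynamics with a high-order scheme) can be illustrated with a minimal sketch on an SIR toy problem. Everything specific here is an assumption for illustration only: a plain polynomial least-squares regressor stands in for the BSS-ANOVA GP, the fourth-order Runge-Kutta integrator is one possible high-order scheme, and the recovery rate `GAMMA`, the sinusoidal transmissibility signal, the `features` map, and the sampling of training states are all hypothetical choices not taken from the paper.

```python
import numpy as np

GAMMA = 0.1  # assumed recovery rate for this toy problem (not from the paper)

def sir_rhs(state, beta):
    """True SIR derivatives; beta (transmissibility) is the forcing input."""
    s, i, _ = state
    return np.array([-beta * s * i, beta * s * i - GAMMA * i, GAMMA * i])

def features(state, beta):
    """Hypothetical feature map: low-order monomials in (S, I, beta)."""
    s, i, _ = state
    return np.array([1.0, s, i, beta, s * i, beta * s, beta * i, beta * s * i])

def rk4_step(f, state, beta, dt):
    """One classical fourth-order Runge-Kutta step, beta held constant."""
    k1 = f(state, beta)
    k2 = f(state + 0.5 * dt * k1, beta)
    k3 = f(state + 0.5 * dt * k2, beta)
    k4 = f(state + dt * k3, beta)
    return state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def rollout(f, state0, betas, dt):
    """Integrate dynamics f from state0 under a sequence of forcing values."""
    states = [state0]
    for beta in betas:
        states.append(rk4_step(f, states[-1], beta, dt))
    return np.array(states)

# Static regression problem: inputs (state, beta) -> targets d(state)/dt.
# Exact derivatives are used as targets here; real data would need
# derivative estimates from measured time series.
rng = np.random.default_rng(0)
n = 200
s = rng.uniform(0.0, 1.0, n)
i = rng.uniform(0.0, 1.0, n) * (1.0 - s)
train_states = np.column_stack([s, i, 1.0 - s - i])
train_betas = rng.uniform(0.1, 0.5, n)
Phi = np.array([features(x, b) for x, b in zip(train_states, train_betas)])
dX = np.array([sir_rhs(x, b) for x, b in zip(train_states, train_betas)])
W, *_ = np.linalg.lstsq(Phi, dX, rcond=None)  # stand-in for the GP fit

def learned_rhs(state, beta):
    """Learned tangent-space model: predicted time derivatives."""
    return features(state, beta) @ W

# Time series prediction: integrate the learned static model forward.
dt = 0.5
t = np.arange(0.0, 100.0, dt)
betas = 0.3 + 0.1 * np.sin(0.2 * t)  # assumed forcing signal
x0 = np.array([0.99, 0.01, 0.0])
true_traj = rollout(sir_rhs, x0, betas, dt)
pred_traj = rollout(learned_rhs, x0, betas, dt)
max_err = np.max(np.abs(true_traj - pred_traj))
print(f"max rollout error: {max_err:.2e}")
```

Because the assumed feature map contains the exact SIR terms and the targets are noise-free, the rollout error in this sketch is near machine precision; with a GP regressor and noisy derivative estimates, the same two-stage structure applies but the error would reflect the quality of the static fit.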