Data-driven geophysical forecasting: Simple, low-cost, and accurate baselines with kernel methods

Modeling geophysical processes as low-dimensional dynamical systems and regressing their vector field from data is a promising approach for learning emulators of such systems. We show that when the kernel of these emulators is also learned from data (using kernel flows, a variant of cross-validation), the resulting data-driven models are not only faster than equation-based models but also easier to train than neural networks such as the long short-term memory (LSTM) network. They are also more accurate and more predictive than the latter. When trained on geophysical observational data, for example the weekly averaged global sea-surface temperature, the proposed technique also shows considerable gains over classical partial differential equation-based models in both forecast computational cost and accuracy. When trained on publicly available reanalysis data for the daily temperature of the North American continent, the models show significant improvements over classical baselines such as climatology and persistence-based forecast techniques. Although our experiments concern specific examples, the proposed approach is general, and our results support the viability of kernel methods (with learned kernels) for interpretable and computationally efficient geophysical forecasting across a wide range of processes.
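To make the idea of regressing a system's dynamics with a learned kernel concrete, the following is a minimal sketch under several assumptions, not the authors' implementation. It fits a Gaussian-kernel ridge regressor to the one-step map of a toy system (the logistic map, chosen here only for illustration), picks the kernel bandwidth by minimizing a kernel-flows-style loss, i.e. the relative change in the squared RKHS norm of the interpolant when a random half of the data is dropped, and compares the result with a persistence baseline. The grid search over bandwidths stands in for the gradient-based optimization that kernel flows would normally use, and all function names, the regularization value, and the bandwidth grid are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def rkhs_norm_sq(X, Y, sigma, reg):
    """Squared RKHS norm of the regularized kernel interpolant of (X, Y)."""
    K = rbf_kernel(X, X, sigma) + reg * np.eye(len(X))
    return float(np.sum(Y * np.linalg.solve(K, Y)))


def rho(X, Y, sigma, rng, reg=1e-6):
    """Kernel-flows-style loss: 1 - ||u_half||^2 / ||u_full||^2.

    A small value means the interpolant barely changes when a random
    half of the data is removed, which suggests the kernel fits the data.
    """
    half = rng.permutation(len(X))[: len(X) // 2]
    return 1.0 - rkhs_norm_sq(X[half], Y[half], sigma, reg) / rkhs_norm_sq(X, Y, sigma, reg)


def run_demo(seed=0, n_train=200, n_test=100, reg=1e-6):
    rng = np.random.default_rng(seed)

    # Toy trajectory (assumption: logistic map, not a geophysical dataset).
    x = np.empty(n_train + n_test + 1)
    x[0] = 0.3
    for k in range(len(x) - 1):
        x[k + 1] = 4.0 * x[k] * (1.0 - x[k])

    # Regression pairs: current state -> next state.
    X, Y = x[:-1, None], x[1:, None]
    X_tr, Y_tr = X[:n_train], Y[:n_train]
    X_te, Y_te = X[n_train:], Y[n_train:]

    # Select the bandwidth by averaging rho over random half-splits
    # (a grid search used here in place of gradient descent).
    sigmas = [0.05, 0.1, 0.2, 0.5, 1.0]
    scores = [np.mean([rho(X_tr, Y_tr, s, rng, reg) for _ in range(20)]) for s in sigmas]
    best_sigma = sigmas[int(np.argmin(scores))]

    # Kernel ridge regression with the selected bandwidth.
    K = rbf_kernel(X_tr, X_tr, best_sigma) + reg * np.eye(n_train)
    coef = np.linalg.solve(K, Y_tr)
    pred = rbf_kernel(X_te, X_tr, best_sigma) @ coef

    # Persistence baseline: predict that the next state equals the current one.
    kernel_rmse = float(np.sqrt(np.mean((pred - Y_te) ** 2)))
    persistence_rmse = float(np.sqrt(np.mean((X_te - Y_te) ** 2)))
    return {
        "best_sigma": best_sigma,
        "kernel_rmse": kernel_rmse,
        "persistence_rmse": persistence_rmse,
    }


if __name__ == "__main__":
    print(run_demo())
```

Calling `run_demo()` returns the selected bandwidth together with the one-step test errors of the kernel model and of persistence. On real geophysical data the inputs would typically be a reduced or delay-embedded state rather than a scalar, and a climatology baseline (the long-term mean for each calendar period) would be added alongside persistence.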
