Next Generation Reservoir Computing (NGRC) is a low-cost machine learning method for forecasting chaotic time series from data. Computational efficiency is crucial for scalable reservoir computing, which calls for better strategies to reduce training cost. In this work, we uncover a connection between the numerical conditioning of the NGRC feature matrix (formed by polynomial evaluations on time-delay coordinates) and the long-term NGRC dynamics. We show that NGRC can be trained without regularization, reducing computational time. Our contributions are twofold. First, merging tools from numerical linear algebra and the ergodic theory of dynamical systems, we systematically study how the conditioning of the feature matrix varies across hyperparameters. We demonstrate that the NGRC feature matrix tends to be ill-conditioned for short time lags, high-degree polynomials, and short training datasets. Second, we evaluate the impact of different numerical algorithms (Cholesky, singular value decomposition (SVD), and lower-upper (LU) decomposition) for solving the regularized least-squares problem. Our results reveal that SVD-based training achieves accurate forecasts without regularization, making it preferable to the other algorithms.
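The two quantities at the heart of the abstract can be illustrated concretely: building a polynomial feature matrix from time-delay coordinates, inspecting its condition number, and solving the least-squares problem with an SVD-based routine and no regularization. The sketch below is a minimal, hypothetical construction (the function `ngrc_features` and all hyperparameter choices are illustrative, not the authors' implementation); NumPy's `lstsq` uses an SVD-based LAPACK driver, matching the training variant the abstract advocates.

```python
import numpy as np
from itertools import combinations_with_replacement

def ngrc_features(x, k=2, s=1, degree=2):
    """Build an NGRC-style feature matrix from a scalar series.

    Each row holds k time-delay coordinates spaced s samples apart,
    plus a constant and all monomials of those coordinates up to
    `degree`. Hypothetical helper for illustration only.
    """
    n = len(x) - (k - 1) * s                     # number of usable rows
    lin = np.column_stack([x[i * s : i * s + n] for i in range(k)])
    cols = [np.ones(n), *lin.T]                  # constant + linear terms
    for d in range(2, degree + 1):               # monomials of degree d
        for idx in combinations_with_replacement(range(k), d):
            cols.append(np.prod(lin[:, list(idx)], axis=1))
    return np.column_stack(cols)

rng = np.random.default_rng(0)
x = np.sin(0.1 * np.arange(500)) + 0.01 * rng.standard_normal(500)

Phi = ngrc_features(x, k=2, s=1, degree=2)       # feature matrix
y = x[2:]                                        # one-step-ahead targets

# Conditioning of the feature matrix (large values signal ill-conditioning,
# e.g. for short lags s or high polynomial degree).
cond = np.linalg.cond(Phi)

# SVD-based least squares, no ridge/Tikhonov term.
W, *_ = np.linalg.lstsq(Phi[:-1], y, rcond=None)
```

Shortening the lag `s` or raising `degree` makes the columns of `Phi` increasingly collinear, which is the ill-conditioning regime the abstract describes; the SVD solver remains usable there because small singular values can be truncated via `rcond` instead of adding a regularization penalty.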