Convergence of uncertainty estimates in Ensemble and Bayesian sparse model discovery

Sparse model identification enables nonlinear dynamical system discovery from data. However, controlling false discoveries in sparse model identification is challenging, especially in the low-data and high-noise limit. In this paper, we perform a theoretical study of ensemble sparse model discovery, which has shown empirical success in terms of accuracy and robustness to noise. In particular, we analyse the bootstrapping-based sequential thresholding least-squares estimator. We show that this bootstrapping-based ensembling technique performs provably correct variable selection, with an error rate that converges at an exponential rate. In addition, we show that the ensemble sparse model discovery method can perform computationally efficient uncertainty estimation, in contrast to expensive Bayesian uncertainty quantification methods via MCMC. We demonstrate the convergence properties and the connection to uncertainty quantification in various numerical studies on synthetic sparse linear regression and sparse model discovery. The sparse linear regression experiments indicate that the bootstrapping-based sequential thresholding least-squares method performs better at sparse variable selection than LASSO, thresholding least-squares, and bootstrapping-based LASSO. In the sparse model discovery experiment, we show that the bootstrapping-based sequential thresholding least-squares method can provide valid uncertainty quantification, converging to a delta measure centered around the true value as the sample size increases. Finally, we highlight that the bootstrapping-based sequential thresholding least-squares method is more robust to hyperparameter selection under shifting noise and sparsity levels than other sparse regression methods.
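
As a rough illustration of the estimator discussed in the abstract, the following is a minimal sketch of bootstrapping-based sequential thresholding least-squares on a synthetic sparse linear regression problem. It is not the authors' implementation: the function names `stlsq` and `bootstrap_stlsq`, the threshold of 0.1, the 200 bootstrap replicates, and the aggregation rule (keep a variable when its inclusion frequency is at least `inclusion_cutoff`) are all assumptions chosen for the example, and the aggregation analysed in the paper may differ.

```python
import numpy as np


def stlsq(X, y, threshold=0.1, max_iter=10):
    """Sequentially thresholded least squares: fit, zero small coefficients, refit."""
    n_features = X.shape[1]
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(max_iter):
        active = np.abs(coef) >= threshold
        if not active.any():
            # Every coefficient fell below the threshold: return the empty model.
            return np.zeros(n_features)
        new_coef = np.zeros(n_features)
        # Refit ordinary least squares on the surviving columns only.
        new_coef[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
        if np.allclose(new_coef, coef):
            break
        coef = new_coef
    return coef


def bootstrap_stlsq(X, y, n_boot=200, threshold=0.1, inclusion_cutoff=0.5, seed=0):
    """Run stlsq on bootstrap resamples of the rows and aggregate the results.

    The inclusion-frequency cutoff is an assumed aggregation rule for this sketch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    coefs = np.empty((n_boot, n_features))
    for b in range(n_boot):
        # Resample rows with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        coefs[b] = stlsq(X[idx], y[idx], threshold=threshold)
    # Fraction of replicates in which each variable was selected.
    inclusion_prob = (coefs != 0).mean(axis=0)
    support = inclusion_prob >= inclusion_cutoff
    return coefs, inclusion_prob, support


# Synthetic sparse linear regression: 2 of 10 coefficients are nonzero.
rng = np.random.default_rng(1)
n, p = 200, 10
X = rng.standard_normal((n, p))
true_coef = np.zeros(p)
true_coef[[0, 3]] = [1.5, -2.0]
y = X @ true_coef + 0.1 * rng.standard_normal(n)

coefs, inclusion_prob, support = bootstrap_stlsq(X, y)
print("inclusion probabilities:", inclusion_prob)
print("selected variables:", np.flatnonzero(support))
print("bootstrap std of coefficients:", coefs.std(axis=0))
```

The rows of `coefs` form an empirical distribution over coefficient vectors, so their spread (for example `coefs.std(axis=0)`) gives an inexpensive uncertainty estimate without running MCMC. Rerunning the sketch with larger `n` should show that spread shrinking, in line with the convergence behaviour the abstract describes.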
