Discovering the partial differential equations underlying a spatio-temporal dataset from very limited observations is of paramount interest in many scientific fields. However, it remains an open question when model discovery algorithms based on sparse regression can actually recover the underlying physical processes. We trace the poor performance of Lasso-based model discovery algorithms back to the Lasso's potential variable selection inconsistency: even if the true model is present in the library, it might not be selected. By first revisiting the irrepresentability condition (IRC) of the Lasso, we gain some insight into when this might occur. We then show that the adaptive Lasso is more likely to satisfy the IRC than the Lasso, and propose to integrate it within a deep learning model discovery framework with stability selection and error control. Experimental results show that we can recover several nonlinear and chaotic canonical PDEs with a single set of hyperparameters from a very limited number of samples at high noise levels.
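To make the role of the IRC concrete, the following is a minimal sketch of how the condition could be checked numerically for a given library. It is not the paper's implementation. It assumes the textbook form of the condition, max over inactive terms of |X_Sc^T X_S (X_S^T X_S)^{-1} sign(beta_S)| < 1, and treats the adaptive Lasso as a Lasso on columns rescaled by per-term penalty weights. The function name `irc_value`, the toy three-term library, and the weight values are all hypothetical choices made for illustration.

```python
import numpy as np


def irc_value(X, support, signs, weights=None):
    """Return the left-hand side of the irrepresentability condition.

    X       : (n, p) library matrix (columns are candidate terms).
    support : indices of the true active terms.
    signs   : signs of the true coefficients, in the order of `support`.
    weights : optional per-term penalty weights (adaptive Lasso);
              None means the plain Lasso (all weights equal to 1).
    A value strictly below 1 means the condition holds.
    """
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    support = np.asarray(support)
    inactive = np.setdiff1d(np.arange(p), support)
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)

    X_s, X_c = X[:, support], X[:, inactive]
    # Solve (X_S^T X_S) z = w_S * sign(beta_S) rather than forming an inverse.
    z = np.linalg.solve(X_s.T @ X_s, w[support] * np.asarray(signs, dtype=float))
    # Rescaling columns by 1/w divides each inactive term's value by its weight.
    return float(np.max(np.abs(X_c.T @ X_s @ z) / w[inactive]))


# Toy library (assumption): terms 0 and 1 are active and uncorrelated,
# term 2 is spurious and has correlation 0.6 with each of them.
Sigma = np.array([[1.0, 0.0, 0.6],
                  [0.0, 1.0, 0.6],
                  [0.6, 0.6, 1.0]])
X = np.linalg.cholesky(Sigma).T  # X^T X equals Sigma

lasso_irc = irc_value(X, support=[0, 1], signs=[1, 1])

# Hypothetical adaptive weights: the spurious term is penalised 10x more,
# as would happen if an initial estimate of its coefficient were small.
adaptive_irc = irc_value(X, support=[0, 1], signs=[1, 1],
                         weights=[1.0, 1.0, 10.0])

print(f"Lasso IRC value:          {lasso_irc:.3f}")
print(f"Adaptive Lasso IRC value: {adaptive_irc:.3f}")
```

In this toy design the unweighted value is 1.2, so the plain Lasso violates the condition even though the true terms are in the library, while the weighted value is 0.12. This only illustrates why reweighting can help. Whether the adaptive Lasso satisfies the condition in practice depends on the quality of the initial estimate used to build the weights.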