Predictions of hydrologic variables across the entire water cycle have significant value for water resource management as well as for downstream applications such as ecosystem and water quality modeling. Recently, purely data-driven deep learning models such as long short-term memory (LSTM) have shown seemingly insurmountable performance in modeling rainfall-runoff and other geoscientific variables, yet they cannot predict untrained physical variables and remain challenging to interpret. Here we show that differentiable, learnable, process-based models (called δ models here) can approach the performance level of LSTM for the intensively observed variable (streamflow) with regionalized parameterization. We use a simple hydrologic model, HBV, as the backbone and use embedded neural networks, which can only be trained in a differentiable programming framework, to parameterize, enhance, or replace the process-based model modules. Without using an ensemble or post-processor, δ models can obtain a median Nash-Sutcliffe efficiency of 0.732 for 671 basins across the USA for the Daymet forcing dataset, compared to 0.748 from a state-of-the-art LSTM model with the same setup. For another forcing dataset, the difference is even smaller: 0.715 vs. 0.722. Meanwhile, the resulting learnable process-based models can output a full set of untrained variables, e.g., soil and groundwater storage, snowpack, evapotranspiration, and baseflow, and can later be constrained by observations of those variables. Both the simulated evapotranspiration and the fraction of discharge from baseflow agreed reasonably well with alternative estimates. The general framework can work with models of varying process complexity and opens up a path for learning physics from big data.
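The scores above are medians of the per-basin Nash-Sutcliffe efficiency (NSE), which the abstract does not define. As a minimal sketch of the standard definition, NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2); the function names `nse` and `median_nse` and the sample values below are illustrative assumptions, not taken from the paper.

```python
from statistics import mean, median


def nse(obs, sim):
    """Nash-Sutcliffe efficiency of a simulated series against observations.

    1.0 is a perfect fit; 0.0 means the simulation is no better than
    always predicting the mean of the observations.
    """
    if len(obs) != len(sim) or not obs:
        raise ValueError("obs and sim must be non-empty and of equal length")
    obs_mean = mean(obs)
    # Sum of squared errors of the simulation.
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    # Sum of squared deviations of the observations from their mean.
    variance = sum((o - obs_mean) ** 2 for o in obs)
    if variance == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - residual / variance


def median_nse(basins):
    """Median NSE over an iterable of (obs, sim) pairs, one pair per basin."""
    return median(nse(obs, sim) for obs, sim in basins)


# Hypothetical streamflow values for two basins (arbitrary units).
basin_a = ([1.0, 2.0, 4.0, 3.0], [1.5, 2.0, 3.5, 3.0])
basin_b = ([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # simulation equals the mean of obs
score = median_nse([basin_a, basin_b])
```

Because the reported figure is a median over basins, a few poorly simulated basins with strongly negative NSE do not drag it down the way a mean would.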