This paper uses weather forecasting as an application background to illustrate the technique of \textit{deep uncertainty learning} (DUL). Weather forecasting has been of great significance throughout human history and is traditionally approached through numerical weather prediction (NWP), in which the atmosphere is modelled by differential equations. However, because these differential equations are unstable in the presence of uncertainties, forecasts obtained through numerical simulation may not be reliable. This paper instead treats weather forecasting as a data mining problem. We build a deep prediction interval (DPI) model, based on the sequence-to-sequence (seq2seq) architecture, that predicts spatio-temporal patterns of meteorological variables over the next 37 hours and incorporates the information provided by NWP. A key contribution and surprising finding in training DPI is that using the mean variance error (MVE) loss instead of the mean square error loss significantly improves the generalization of the point estimate, which has not been reported in previous research. We regard this phenomenon as a new kind of regularization that is not only on a par with the well-known Dropout but also provides additional uncertainty information, a win-win outcome. Building on a single DPI model, we then construct a deep ensemble. We evaluate our method on a dataset from 10 real weather stations in Beijing, China. Experimental results show that DPI generalizes better than traditional point estimation and that the deep ensemble further improves performance. The deep ensemble method also ranked in the top 2 by online score in the AI Challenger 2018 competition. It reduces error by up to 56\% compared with NWP.
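To make the contrast between the two training losses concrete, the following is a minimal sketch rather than the implementation used in this paper. It assumes that the MVE loss takes the common form of a Gaussian negative log-likelihood in which the network outputs both a mean and a log-variance for each target; this form is not spelled out in the abstract. The function names (`mse_loss`, `mve_loss`, `prediction_interval`) and the 1.96 multiplier for a nominal 95\% interval are illustrative assumptions.

```python
import numpy as np


def mse_loss(y_true, mean_pred):
    """Mean square error: trains a point estimate only."""
    return float(np.mean((y_true - mean_pred) ** 2))


def mve_loss(y_true, mean_pred, log_var_pred):
    """Assumed MVE form: Gaussian negative log-likelihood (constant dropped).

    The model predicts a mean and a log-variance per target, so the squared
    error of each sample is re-weighted by the predicted variance.
    """
    var = np.exp(log_var_pred)  # predicting log-variance keeps the variance positive
    return float(np.mean(0.5 * log_var_pred + 0.5 * (y_true - mean_pred) ** 2 / var))


def prediction_interval(mean_pred, log_var_pred, z=1.96):
    """Symmetric interval around the mean; z=1.96 is a nominal 95% Gaussian level."""
    std = np.exp(0.5 * log_var_pred)
    return mean_pred - z * std, mean_pred + z * std


# Hypothetical toy values, not data from the paper.
y_true = np.array([1.0, 2.0, 3.0])
mean_pred = np.array([1.0, 2.0, 5.0])
log_var_pred = np.zeros(3)  # unit variance for every target

point_loss = mse_loss(y_true, mean_pred)
interval_loss = mve_loss(y_true, mean_pred, log_var_pred)
lower, upper = prediction_interval(mean_pred, log_var_pred)
```

Under this assumed form, a fixed unit variance reduces the MVE loss to half the MSE, so the two objectives differ only once the network is allowed to learn the variance, which is also what yields the prediction interval alongside the point estimate.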