In the last decade, much work in atmospheric science has focused on spatial verification (SV) methods for gridded prediction, which overcome serious disadvantages of pixelwise verification. However, neural networks (NNs) in atmospheric science are almost always trained to optimize pixelwise loss functions, even when they are ultimately assessed with SV methods. This creates a disconnect between how a model is verified during training and how it is verified afterwards. To address this issue, we develop spatially enhanced loss functions (SELFs) and demonstrate their use for a real-world problem: predicting the occurrence of thunderstorms (henceforth, "convection") with NNs. In each SELF we use either a neighbourhood filter, which highlights convection at scales larger than a threshold, or a spectral filter (employing Fourier or wavelet decomposition), which is more flexible and highlights convection at scales between two thresholds. We use these filters to spatially enhance common verification scores, such as the Brier score. We train each NN with a different SELF and compare the NNs' performance at many scales of convection, from discrete storm cells to tropical cyclones. Our findings include the following: (a) for a low (high) risk threshold, the ideal SELF focuses on small (large) scales; (b) models trained with a pixelwise loss function perform surprisingly well; (c) however, models trained with a spectral filter produce much better-calibrated probabilities than models trained with a pixelwise loss function. We provide a general guide to using SELFs, including technical challenges and the final Python code, and we demonstrate their use for the convection problem. To our knowledge this is the most in-depth guide to SELFs in the geosciences.
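To make the two filter types concrete, the following is a minimal NumPy sketch of a neighbourhood-filtered and a Fourier-band-passed Brier score. It is not the authors' released code: the function names, the square mean-filter window, the zero padding, the hard (unsmoothed) spectral mask, and the choice to filter both the target and the forecast are all assumptions made for illustration. A real loss function for NN training would also need to be written with differentiable tensor operations rather than NumPy.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def neighbourhood_filter(field, half_width):
    """Mean filter over a square window of (2 * half_width + 1) pixels.

    Assumption: the field is zero-padded at the edges.
    """
    size = 2 * half_width + 1
    padded = np.pad(field.astype(float), half_width, mode="constant")
    windows = sliding_window_view(padded, (size, size))
    return windows.mean(axis=(-2, -1))


def fourier_bandpass(field, min_wavelength, max_wavelength):
    """Keep Fourier components whose wavelength (in pixels) lies in the band.

    Assumption: a hard mask on radial frequency, with no tapering.
    """
    num_rows, num_cols = field.shape
    row_freq = np.fft.fftfreq(num_rows)[:, np.newaxis]
    col_freq = np.fft.fftfreq(num_cols)[np.newaxis, :]
    radial_freq = np.sqrt(row_freq ** 2 + col_freq ** 2)
    keep = (radial_freq >= 1.0 / max_wavelength) & (
        radial_freq <= 1.0 / min_wavelength
    )
    return np.real(np.fft.ifft2(np.fft.fft2(field) * keep))


def brier_score(target, probability):
    """Mean squared difference between the two fields."""
    return float(np.mean((probability - target) ** 2))


# Hypothetical example: one storm pixel, forecast displaced by two columns.
target = np.zeros((20, 20))
target[10, 10] = 1.0
forecast = np.zeros((20, 20))
forecast[10, 12] = 1.0

pixelwise_bs = brier_score(target, forecast)
neighbourhood_bs = brier_score(
    neighbourhood_filter(target, half_width=2),
    neighbourhood_filter(forecast, half_width=2),
)
spectral_bs = brier_score(
    fourier_bandpass(target, min_wavelength=4, max_wavelength=16),
    fourier_bandpass(forecast, min_wavelength=4, max_wavelength=16),
)
print(pixelwise_bs, neighbourhood_bs, spectral_bs)
```

In this toy case the pixelwise score treats the displaced storm as two separate errors (a miss and a false alarm), while the neighbourhood-filtered score is smaller because the smoothed fields overlap. The band-pass version compares only the part of each field between the two wavelength thresholds.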