Weather forecasting is one of the cornerstones of meteorological work. In this paper, we present a new benchmark dataset named Weather2K, which aims to address the shortcomings of existing weather forecasting datasets in timeliness, reliability, and diversity, as well as the key bottleneck of data quality. Specifically, Weather2K has the following features: 1) Reliable and real-time data. The data is collected hourly from 2,130 ground weather stations covering an area of 6 million square kilometers. 2) Multivariate meteorological variables. 20 meteorological factors and 3 constants for position information are provided over 40,896 time steps. 3) Applicable to diverse tasks. We conduct a set of baseline tests on time series forecasting and spatio-temporal forecasting. To the best of our knowledge, Weather2K is the first attempt to tackle the weather forecasting task by taking full advantage of the strengths of observation data from ground weather stations. Based on Weather2K, we further propose the Meteorological Factors based Multi-Graph Convolution Network (MFMGCN), which can effectively construct the intrinsic correlation among geographic locations based on meteorological factors. Extensive experiments show that MFMGCN improves both forecasting performance and temporal robustness. We hope Weather2K can motivate researchers to develop efficient and accurate algorithms that advance the task of weather forecasting. The dataset is available at https://github.com/bycnfz/weather2k/.
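To make the idea of linking stations through meteorological factors concrete, the following is a minimal sketch and not the MFMGCN construction itself, whose details are not given here. It assumes that two stations are connected when their readings of a single factor have a Pearson correlation above a chosen threshold. The names `pearson` and `factor_adjacency`, the threshold of 0.8, and the toy temperature values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def factor_adjacency(series_by_station, threshold=0.8):
    """Build a symmetric 0/1 adjacency matrix from one factor's time series.

    Assumption for illustration: stations i and j are linked when the
    correlation of their series is at least `threshold`.
    """
    n = len(series_by_station)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(series_by_station[i], series_by_station[j]) >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj


# Toy hourly temperature readings for three hypothetical stations.
temperature = [
    [10.0, 12.0, 15.0, 13.0, 11.0],
    [20.0, 22.5, 25.0, 23.0, 21.0],
    [15.0, 13.0, 10.0, 12.0, 14.0],
]
adjacency = factor_adjacency(temperature)
```

In this toy input the first two stations rise and fall together and are linked, while the third moves in the opposite direction and stays isolated. Repeating the construction once per factor would yield one graph per factor, which is one plausible reading of "multi-graph" under the assumptions stated above.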