Hydroclimatic time series analysis focuses on a few feature types (e.g., autocorrelations, trends, extremes), which describe only a small portion of the information content of the observations. Aiming to exploit a larger part of the available information and, thus, to deliver more reliable results (e.g., in hydroclimatic time series clustering contexts), here we approach hydroclimatic time series analysis differently, i.e., by performing massive feature extraction. In this respect, we develop a big data framework for hydroclimatic variable behaviour characterization. This framework relies on approximately 60 diverse features and is completely automatic (in the sense that it does not depend on the hydroclimatic process at hand). We apply the new framework to characterize mean monthly temperature, total monthly precipitation and mean monthly river flow. The applications are conducted at the global scale by exploiting 40-year-long time series originating from over 13 000 stations. We extract interpretable knowledge on seasonality, trends, autocorrelation, long-range dependence and entropy, and on feature types that are encountered less frequently. We further compare the examined hydroclimatic variable types in terms of this knowledge and identify patterns related to the spatial variability of the features. For this latter purpose, we also propose and exploit a hydroclimatic time series clustering methodology. This new methodology is based on Breiman's random forests. The descriptive and exploratory insights gained by the global-scale applications prove the usefulness of the adopted feature compilation in hydroclimatic contexts. Moreover, the spatially coherent patterns characterizing the clusters delivered by the new methodology build confidence in its future exploitation...
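To make the idea of feature extraction concrete, the following is a minimal sketch in Python that computes three of the feature types named above (autocorrelation, trend, seasonality) for a single monthly series. It is an illustration only: the function names, the feature definitions and the choice of a 12-month period are assumptions made here, not the roughly 60 features of the framework, whose definitions are not given in this abstract.

```python
import math
from statistics import fmean


def lag_autocorrelation(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    m = fmean(x)
    denom = sum((v - m) ** 2 for v in x)
    if denom == 0:
        return 0.0  # constant series: autocorrelation is undefined, report 0
    num = sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag))
    return num / denom


def linear_trend_slope(x):
    """Least-squares slope of x against its time index (units per time step)."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = fmean(x)
    num = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    denom = sum((t - t_mean) ** 2 for t in range(n))
    return num / denom


def seasonal_strength(x, period=12):
    """Share of total variance explained by the mean seasonal cycle (0 to 1)."""
    m = fmean(x)
    total = sum((v - m) ** 2 for v in x)
    if total == 0:
        return 0.0
    # Mean value for each position in the cycle (e.g., each calendar month).
    cycle = [fmean(x[k::period]) for k in range(period)]
    explained = sum((cycle[i % period] - m) ** 2 for i in range(len(x)))
    return explained / total


def extract_features(x, period=12):
    """Hypothetical helper: collect the illustrative features into one record."""
    return {
        "lag1_autocorrelation": lag_autocorrelation(x, lag=1),
        "trend_slope": linear_trend_slope(x),
        "seasonal_strength": seasonal_strength(x, period=period),
    }


if __name__ == "__main__":
    # Synthetic 40-year monthly series: seasonal cycle plus a weak upward trend.
    series = [
        10 * math.sin(2 * math.pi * i / 12) + 0.01 * i for i in range(480)
    ]
    print(extract_features(series))
```

Applying such a function to every station yields one feature vector per time series. Vectors of this kind could then serve as input to a clustering step, such as the random-forest-based methodology mentioned above.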