This paper presents a novel spatio-temporal LSTM (SPATIAL) architecture for time series forecasting applied to environmental datasets. The framework was evaluated on three oceanic variables: current speed, measured by four sensors, and temperature and dissolved oxygen, each measured by eight sensors. The network learns in two directions that are nominally separate but connected as part of a natural environmental system: the spatial direction (between individual sensors) and the temporal direction of the sensor data. Results were compared against random forest (RF) and XGBoost (XGB) baseline models, which learned from the temporal signal of each sensor independently, using date-time features together with the past history of the data arranged in a sliding-window matrix. The results demonstrated that SPATIAL can accurately replicate complex signals and that its performance is comparable to state-of-the-art benchmarks. Notably, the framework provides a simpler pre-processing and training pipeline that handles missing values via a simple masking layer. By enabling learning across the spatial and temporal directions, this paper addresses two fundamental challenges of ML applications to environmental science: 1) data sparsity, together with the difficulty and cost of collecting measurements of environmental conditions such as ocean dynamics, and 2) the fact that environmental datasets are inherently connected in the spatial and temporal directions, whereas classical ML approaches consider only one of these directions. Furthermore, sharing parameters across all input steps makes SPATIAL a fast, scalable, and easily parameterized forecasting framework.
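The baseline feature construction mentioned above (past history in a sliding-window matrix plus date-time features) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the window length, the date-time features used, or how the masking layer is configured, so the function names, the single hour-of-day feature, and the sentinel value `-1.0` are all hypothetical choices made for illustration.

```python
import numpy as np

def sliding_window(series, window):
    """Return (X, y): each row of X holds `window` past values, y is the next value."""
    series = np.asarray(series, dtype=float)
    if window < 1 or window >= len(series):
        raise ValueError("window must be in [1, len(series) - 1]")
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

def add_hour_feature(X, hours, window):
    """Append the hour of day of each target step as one extra column (assumed feature)."""
    hours = np.asarray(hours, dtype=float)
    return np.column_stack([X, hours[window:]])

def mask_missing(X, mask_value=-1.0):
    """Replace NaNs with a sentinel (hypothetical value) and return a validity mask."""
    valid = ~np.isnan(X)
    return np.where(valid, X, mask_value), valid

# Toy single-sensor signal with one missing reading.
signal = [1.0, 2.0, np.nan, 4.0, 5.0]
hours = [0, 1, 2, 3, 4]

X, y = sliding_window(signal, window=2)
X = add_hour_feature(X, hours, window=2)
X_masked, valid = mask_missing(X)
```

A tree-based baseline would typically need rows with missing values dropped or imputed before training, whereas a sequence model with a masking step can skip the flagged entries; the sentinel replacement here only mimics that idea.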