In this paper, we propose an online algorithm mspace for forecasting node features in temporal graphs, which captures spatial cross-correlation among different nodes as well as the temporal auto-correlation within a node. The algorithm can be used for both probabilistic and deterministic multi-step forecasting, making it applicable for estimation and generation tasks. Comparative evaluations against various baselines, including temporal graph neural network (TGNN) models and classical Kalman filters, demonstrate that mspace performs at par with the state-of-the-art and even surpasses them on some datasets. Importantly, mspace demonstrates consistent performance across datasets with varying training sizes, a notable advantage over TGNN models that require abundant training samples to effectively learn the spatiotemporal trends in the data. Therefore, employing mspace is advantageous in scenarios where the training sample availability is limited. Additionally, we establish theoretical bounds on multi-step forecasting error of mspace and show that it scales linearly with the number of forecast steps $q$ as $\mathcal{O}(q)$. For an asymptotically large number of nodes $n$, and timesteps $T$, the computational complexity of mspace grows linearly with both $n$, and $T$, i.e., $\mathcal{O}(nT)$, while its space complexity remains constant $\mathcal{O}(1)$. We compare the performance of various mspace variants against ten recent TGNN baselines and two classical baselines, ARIMA and the Kalman filter across ten real-world datasets. Additionally, we propose a technique to generate synthetic datasets to aid in evaluating node feature forecasting methods, with the potential to serve as a benchmark for future research. Lastly, we have investigate the interpretability of different mspace variants by analyzing model parameters alongside dataset characteristics to derive model and data-centric insights.
翻译:暂无翻译