Flooding is one of the most destructive and costly natural disasters, and climate change is expected to further increase flood risk globally. This work presents a novel multimodal machine learning approach for multi-year global flood risk prediction that combines geographical information with a historical natural disaster dataset. Our multimodal framework employs state-of-the-art processing techniques to extract embeddings from each data modality, including text-based geographical data and tabular time-series data. Experiments demonstrate that a multimodal approach that combines text and statistical data outperforms a single-modality approach. Our most advanced architecture, which uses embeddings extracted via transfer learning on the DistilBERT model, achieves a ROC AUC score of 75\%-77\% in predicting flooding events over the next 1-5 years in historically flooded locations. This work demonstrates the potential of machine learning for long-term planning in natural disaster management.
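As a rough illustration of the two ideas named above (combining modalities and scoring with ROC AUC), the following minimal sketch may help. It is not the authors' implementation: the abstract does not say how the text and tabular embeddings are combined, so plain concatenation is an assumption, and the function names `fuse_features` and `roc_auc` as well as all numbers are hypothetical toy values.

```python
from typing import List, Sequence


def fuse_features(text_embedding: Sequence[float],
                  tabular_features: Sequence[float]) -> List[float]:
    """Concatenate a text embedding with tabular features.

    Assumption: simple concatenation; the source does not state
    the fusion strategy actually used.
    """
    return list(text_embedding) + list(tabular_features)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a randomly chosen positive
    example is scored above a randomly chosen negative one.
    Tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data only: 1 = location flooded in the prediction window, 0 = it did not.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)

# Toy fused feature vector: two embedding dimensions plus one tabular value.
toy_fused = fuse_features([0.1, 0.2], [3.0])
```

In practice, the text embedding would come from a pretrained language model such as DistilBERT, and the fused vector would feed a downstream classifier whose predicted scores are evaluated with ROC AUC.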