Effective resource allocation during emergencies, especially hurricane disasters, is crucial. However, most research focuses on emergency resource allocation in ground transportation systems. In this paper, we propose Learning-to-Dispatch (L2D), a reinforcement learning (RL) based air route dispatching system that aims to add flights for hurricane evacuation while minimizing airspace complexity and air traffic controller workload. Given a bipartite graph whose edge weights are learned from historical flight data using RL, accounting for both short- and long-term gains, we formulate flight dispatch as an online maximum weight matching problem. Unlike the conventional order dispatch problem, no actual or estimated index exists for evaluating how the additional evacuation flights affect air traffic complexity. We therefore propose a multivariate reward function for the learning phase and compare it with univariate reward designs to demonstrate its superior performance. Experiments on a real-world dataset from Hurricane Irma demonstrate the efficacy and efficiency of the proposed scheme.
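To make the matching formulation in the abstract concrete, the following is a minimal sketch of one matching round on a small bipartite graph. It is an illustration under stated assumptions, not the authors' implementation: the function name `max_weight_matching`, the notion of a "route slot", and the example weights are all hypothetical, and the exact solver shown here (a bitmask dynamic program) is chosen only to keep the example self-contained. In an online setting, a round like this would presumably be re-run for each batch of pending evacuation flights, with the edge weights supplied by the learned value function.

```python
from functools import lru_cache
from typing import Dict, List, Optional, Tuple


def max_weight_matching(
    weights: List[List[Optional[float]]],
) -> Tuple[float, Dict[int, int]]:
    """Exact maximum weight bipartite matching via bitmask dynamic programming.

    weights[i][j] is the (hypothetical) learned value of assigning flight i
    to route slot j, or None if that edge does not exist. A flight may be
    left unmatched. Only practical for small batches, since the state space
    grows exponentially with the number of slots.
    """
    n_flights = len(weights)
    n_slots = len(weights[0]) if weights else 0

    @lru_cache(maxsize=None)
    def best(i: int, used: int) -> Tuple[float, Tuple[Tuple[int, int], ...]]:
        # Base case: every flight has been considered.
        if i == n_flights:
            return 0.0, ()
        # Option 1: leave flight i unmatched.
        best_total, best_pairs = best(i + 1, used)
        # Option 2: assign flight i to any free slot it has an edge to.
        for j in range(n_slots):
            w = weights[i][j]
            if w is None or used & (1 << j):
                continue
            sub_total, sub_pairs = best(i + 1, used | (1 << j))
            if w + sub_total > best_total:
                best_total = w + sub_total
                best_pairs = ((i, j),) + sub_pairs
        return best_total, best_pairs

    total, pairs = best(0, 0)
    return total, dict(pairs)


# Hypothetical weights: 2 evacuation flights, 3 candidate route slots.
example_weights = [
    [5.0, 4.0, None],
    [4.0, 1.0, None],
]
total, assignment = max_weight_matching(example_weights)
print(total, assignment)
```

In this example a greedy choice (flight 0 takes slot 0 for weight 5.0) would leave flight 1 with only the 1.0 edge, for a total of 6.0, whereas the exact matching assigns flight 0 to slot 1 and flight 1 to slot 0. For realistic batch sizes, a polynomial-time solver such as the Hungarian (Kuhn-Munkres) algorithm would be the more usual choice.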