Measures taken to mitigate the impact of the pandemic include lockdowns, rapid vaccination programs, school closures, and economic stimulus. These interventions can have positive consequences or unintended negative ones. Current research that models and automatically determines an optimal intervention through round-tripping is limited by its simulation objectives, its scale (a few thousand individuals), model types that are not suited to intervention studies, and the range of intervention strategies it can explore (discrete vs. continuous). We address these challenges with a Deep Deterministic Policy Gradient (DDPG) based policy optimization framework that performs multi-objective optimization on a large-scale (100,000-individual) epidemiological agent-based simulation. We determine the optimal policy for lockdown and vaccination in a minimalist age-stratified, multi-vaccine scenario with a basic simulation of economic activity. With no lockdown and with vaccination of the mid-age and elderly groups, results show an optimal economy (measured by individuals below the poverty line) with balanced health objectives (infection and hospitalization). An in-depth simulation is needed to further validate our results and open-source our framework.
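As an illustration of how several objectives and a continuous intervention space might be combined for a DDPG-style agent, the sketch below shows three small pieces: clamping a continuous action, a weighted-sum reward over health and economic outcomes, and the soft target-network update that DDPG uses. The abstract does not state how the objectives are combined, so the weighted sum, the `Outcome` fields, the weights, and all function names here are assumptions for illustration only, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Hypothetical per-step simulation outputs, each a fraction of the population."""
    infected: float
    hospitalized: float
    below_poverty_line: float


def clip_action(raw_action):
    """Clamp each continuous action component (e.g. lockdown level,
    per-age-group vaccination rate; hypothetical) to the range [0, 1]."""
    return [min(1.0, max(0.0, a)) for a in raw_action]


def scalarized_reward(outcome, weights=(1.0, 1.0, 1.0)):
    """Weighted-sum scalarization (an assumption): the reward is the
    negative weighted cost, so fewer bad outcomes give a higher reward."""
    w_inf, w_hosp, w_pov = weights
    cost = (w_inf * outcome.infected
            + w_hosp * outcome.hospitalized
            + w_pov * outcome.below_poverty_line)
    return -cost


def soft_update(target, online, tau=0.005):
    """Polyak averaging of target-network parameters, as used in DDPG:
    target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]
```

Changing the weights shifts the trade-off between health and economic objectives; a policy that looks optimal under one weighting may not be under another, which is one reason results of this kind need further validation.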