Epidemiologists model the dynamics of epidemics in order to propose control strategies based on pharmaceutical and non-pharmaceutical interventions (contact limitation, lockdown, vaccination, etc.). Hand-designing such strategies is not trivial because of the number of possible interventions and the difficulty of predicting their long-term effects. This task can be cast as an optimization problem, where state-of-the-art machine learning algorithms such as deep reinforcement learning might bring significant value. However, the specificity of each domain (epidemic modelling on one side, solving optimization problems on the other) requires strong collaboration between researchers from different fields of expertise. This is why we introduce EpidemiOptim, a Python toolbox that facilitates collaboration between researchers in epidemiology and optimization. EpidemiOptim turns epidemiological models and cost functions into optimization problems via a standard interface commonly used by optimization practitioners (OpenAI Gym). Reinforcement learning algorithms based on Q-Learning with deep neural networks (DQN) and evolutionary algorithms (NSGA-II) are already implemented. We illustrate the use of EpidemiOptim to find optimal policies for dynamical on-off lockdown control that jointly optimize the death toll and the economic recession, using a Susceptible-Exposed-Infectious-Removed (SEIR) model for COVID-19. Using EpidemiOptim and its interactive visualization platform in Jupyter notebooks, epidemiologists, optimization practitioners and others (e.g. economists) can easily compare epidemiological models, cost functions and optimization algorithms to inform the important choices that health decision-makers have to make.
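To make the idea of casting epidemic control as an optimization problem more concrete, the following is a minimal sketch of a toy SEIR model exposed through the classic Gym-style `reset`/`step` convention, with a binary lockdown action and two costs (deaths and economic loss). It is not EpidemiOptim's actual API: the class name `ToySEIRLockdownEnv`, the helper `run_episode`, every parameter value, and the weighted-sum reward are assumptions chosen purely for illustration, and the code deliberately avoids importing Gym so that it stays self-contained.

```python
from dataclasses import dataclass


@dataclass
class ToySEIRLockdownEnv:
    """Toy SEIR epidemic with an on/off lockdown action (Gym-style reset/step).

    Hypothetical illustration only: all values below are assumed, not calibrated.
    """
    population: float = 1_000_000.0
    beta: float = 0.5              # daily transmission rate without lockdown
    lockdown_factor: float = 0.3   # multiplier applied to beta during lockdown
    sigma: float = 1 / 5           # rate of leaving the Exposed compartment
    gamma: float = 1 / 7           # rate of leaving the Infectious compartment
    fatality: float = 0.01         # fraction of removed individuals who die
    econ_cost_per_day: float = 1.0 # arbitrary economic cost per lockdown day
    death_weight: float = 0.01     # weight trading deaths against economic cost
    days_per_step: int = 7         # the policy acts once per week
    horizon_days: int = 364

    def reset(self):
        """Start a new episode with 100 infectious individuals."""
        self.t = 0
        self.state = [self.population - 100.0, 0.0, 100.0, 0.0]  # S, E, I, R
        return tuple(self.state)

    def step(self, action):
        """Apply action (0 = no lockdown, 1 = lockdown) for one week."""
        assert action in (0, 1)
        beta = self.beta * (self.lockdown_factor if action == 1 else 1.0)
        s, e, i, r = self.state
        new_deaths = 0.0
        for _ in range(self.days_per_step):
            # Forward-Euler update of the SEIR equations with a one-day step.
            new_exposed = beta * s * i / self.population
            new_infectious = self.sigma * e
            new_removed = self.gamma * i
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
            new_deaths += self.fatality * new_removed
        self.state = [s, e, i, r]
        self.t += self.days_per_step
        econ = self.econ_cost_per_day * self.days_per_step * action
        # Scalarized reward: negative weighted sum of the two costs.
        reward = -(self.death_weight * new_deaths + econ)
        done = self.t >= self.horizon_days
        info = {"deaths": new_deaths, "economic_cost": econ}
        return tuple(self.state), reward, done, info


def run_episode(env, policy):
    """Roll out a policy (observation -> action) and return total costs."""
    obs = env.reset()
    done = False
    total_deaths = 0.0
    total_econ = 0.0
    while not done:
        obs, _, done, info = env.step(policy(obs))
        total_deaths += info["deaths"]
        total_econ += info["economic_cost"]
    return total_deaths, total_econ


if __name__ == "__main__":
    env = ToySEIRLockdownEnv()
    for name, policy in [("never lock down", lambda obs: 0),
                         ("always lock down", lambda obs: 1)]:
        deaths, econ = run_episode(env, policy)
        print(f"{name}: deaths={deaths:.0f}, economic cost={econ:.0f}")
```

The two constant policies mark the extremes of the trade-off: one incurs no economic cost, the other keeps the lockdown on for the whole horizon. An optimizer such as DQN or NSGA-II would search between them for policies that switch the lockdown on and off depending on the observed state; a multi-objective method would work with the two costs in `info` directly rather than with the scalarized reward.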