Investing time and resources in better strategies and methodologies for tackling a potential pandemic is key to dealing with future outbreaks of new variants or other viruses. In this work, we recreate the situation of a year ago, 2020, when the pandemic erupted across the world, for the fifty countries with the most reported COVID-19 cases. We compare state-of-the-art machine learning algorithms, such as LSTM, against online incremental machine learning algorithms, which adapt to the daily changes in the spread of the disease, to predict future COVID-19 cases. To compare the methods, we performed three experiments. In the first, we trained the models using only data from the country being predicted. In the second, we used data from all fifty countries to train the models and predict each country. In both the first and second experiments, we used a static hold-out approach for all methods. In the third experiment, we trained the incremental methods sequentially, using a prequential (test-then-train) evaluation. This scheme is not suitable for most state-of-the-art machine learning algorithms because they need to be retrained from scratch for every batch of predictions, causing a computational burden. Results show that incremental methods are a promising approach for adapting to changes in the disease over time: they are always up to date with the latest state of the data distribution, and they have a significantly lower computational cost than other techniques such as LSTMs.
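To make the prequential scheme concrete, the following is a minimal sketch of a test-then-train loop over a single time series. It is not the models or library used in this work: `OnlineLinearRegressor` is a hypothetical stand-in (a small linear model updated with normalized least-mean-squares steps), and the lag-window features, the `n_lags` parameter, and the function name `prequential_errors` are assumptions made only for illustration.

```python
from typing import List, Sequence


class OnlineLinearRegressor:
    """Hypothetical stand-in for an incremental learner (not the paper's model).

    A linear model updated one sample at a time with normalized LMS steps.
    """

    def __init__(self, n_features: int, learning_rate: float = 0.5) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_one(self, x: Sequence[float]) -> float:
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def learn_one(self, x: Sequence[float], y: float) -> None:
        error = y - self.predict_one(x)
        # Normalize by the input energy (the 1.0 accounts for the bias term)
        # so the step size stays stable regardless of the scale of x.
        norm = 1.0 + sum(xi * xi for xi in x)
        step = self.learning_rate * error / norm
        self.weights = [w + step * xi for w, xi in zip(self.weights, x)]
        self.bias += step


def prequential_errors(
    series: Sequence[float], n_lags: int, model: OnlineLinearRegressor
) -> List[float]:
    """Test-then-train: predict each new value first, then learn from it."""
    abs_errors = []
    for t in range(n_lags, len(series)):
        x = series[t - n_lags:t]        # the previous n_lags observations
        y = series[t]                   # the value to be predicted
        y_pred = model.predict_one(x)   # 1) test on the unseen observation
        abs_errors.append(abs(y - y_pred))
        model.learn_one(x, y)           # 2) only then train on it
    return abs_errors
```

Each observation is used exactly once for testing and once for training, so the model is never retrained from scratch. Averaging the returned errors gives a running mean absolute error; a batch model evaluated the same way would need a full refit at every step, which is the computational burden the abstract refers to.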