An epidemic is the rapid and wide spread of an infectious disease, threatening many lives and causing economic damage. It is important to foretell the lifetime of an epidemic so that timely remedial actions can be decided on. These measures include closing borders and schools, and suspending community services and commuting. Lifting such restrictions depends on the momentum of the outbreak and its rate of decay. Accurately forecasting the fate of an epidemic is an extremely important but difficult task. Due to limited knowledge of the novel disease, the high uncertainty involved, and the complex socio-political factors that influence the spread of the new virus, any forecast is anything but reliable. Another factor is the insufficient amount of available data: samples are often scarce when an epidemic has just started. With only a few training samples on hand, finding a forecasting model that delivers the best achievable forecast is a big challenge in machine learning. In the past, three popular methods have been proposed: 1) augmenting the little data that exists, 2) using a panel selection to pick the best forecasting model from several candidates, and 3) fine-tuning the parameters of an individual forecasting model for the highest possible accuracy. In this paper, a methodology that embraces these three virtues of data mining from a small dataset is proposed...
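To make the second and third of the three approaches named in the abstract concrete, the following is a minimal Python sketch of panel selection combined with parameter tuning on a very short series. It is not the methodology of the paper: the three candidate forecasters, the rolling one-step-ahead evaluation, the grid of `alpha` values, and all function names are assumptions chosen for illustration, and the data in the demo are made up. Data augmentation is left out to keep the sketch short.

```python
import numpy as np


def naive(history):
    """Forecast the next value as the last observed value."""
    return history[-1]


def linear_trend(history):
    """Fit a straight line to the history and extrapolate one step ahead."""
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, 1)
    return slope * len(history) + intercept


def exp_smoothing(history, alpha):
    """Simple exponential smoothing; the final level is the forecast."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def rolling_error(series, forecaster, min_train=3, **params):
    """Mean absolute one-step-ahead error over an expanding window."""
    errors = []
    for end in range(min_train, len(series)):
        pred = forecaster(series[:end], **params)
        errors.append(abs(pred - series[end]))
    return float(np.mean(errors))


def select_model(series):
    """Return (error, name, params) of the best candidate in the panel."""
    # Hypothetical panel: each entry is a forecaster plus a parameter grid.
    panel = {
        "naive": (naive, [{}]),
        "linear_trend": (linear_trend, [{}]),
        "exp_smoothing": (exp_smoothing,
                          [{"alpha": a} for a in (0.2, 0.5, 0.8)]),
    }
    best = None
    for name, (forecaster, grid) in panel.items():
        for params in grid:  # parameter tuning by exhaustive grid search
            err = rolling_error(series, forecaster, **params)
            if best is None or err < best[0]:
                best = (err, name, params)
    return best


if __name__ == "__main__":
    # Made-up daily counts, for demonstration only.
    toy_series = np.array([12.0, 19.0, 33.0, 41.0, 58.0, 66.0, 81.0])
    err, name, params = select_model(toy_series)
    print(f"selected {name} with params {params}, mean abs error {err:.2f}")
```

With so few samples, the rolling evaluation rests on only a handful of held-out points, so the selected model can change as each new observation arrives; this instability is the small-data difficulty the abstract describes.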