Infectious disease surveillance is of great importance for preventing major outbreaks. Syndromic surveillance aims to develop algorithms that detect outbreaks as early as possible by monitoring data sources that capture the occurrences of a certain disease. Recent research mainly addresses the surveillance of specific, known diseases and concentrates on defining the disease pattern under surveillance. So far, little effort has been devoted to what we call non-specific syndromic surveillance, i.e., the use of all available data to detect any kind of outbreak, including infectious diseases that are unknown beforehand. In this work, we revisit published approaches for non-specific syndromic surveillance and present a set of simple statistical modeling techniques that can serve as benchmarks for more elaborate machine learning approaches. Our experimental comparison on established synthetic data and on real data into which we injected synthetic outbreaks shows that these benchmarks already achieve very competitive results and often outperform more elaborate algorithms.