Sepsis claims 5.4 million lives worldwide every year and incurs healthcare costs of more than 16 billion dollars in the USA alone. It is one of the leading causes of hospital mortality and an increasing concern in the ageing Western world. Recently, medical and technological advances have helped redefine the criteria for this illness, which otherwise remains poorly understood by the medical community. Together with the rise of widely accessible Electronic Health Records, advances in data mining and complex nonlinear algorithms offer a promising avenue for the early detection of sepsis. This work contributes to research on automated sepsis detection with an open-access labelling of the MIMIC-III medical data set. Moreover, we propose MGP-AttTCN: a joint multitask Gaussian Process and attention-based deep learning model for the early and interpretable prediction of sepsis onset. We show that our model outperforms the current state of the art and present evidence that different labelling heuristics lead to discrepancies in task difficulty. For instance, when predicting sepsis five hours prior to onset on our new realistic labels, our proposed model achieves an area under the ROC curve (AUROC) of 0.660 and an area under the PR curve (AUPR) of 0.483, whereas the (less interpretable) previous state-of-the-art model (MGP-TCN) achieves 0.635 AUROC and 0.460 AUPR and the popular commercial InSight model achieves 0.490 AUROC and 0.359 AUPR.