Prediction of physiologic states is important in medical practice because interventions are chosen according to their predicted effects. But prediction is difficult in medicine: the generating system is complex and hard to understand from data alone, and the data are sparse relative to the complexity of the generating processes because of the human costs of data collection. Computational machinery can potentially make prediction more accurate, but working within the constraints of realistic clinical data makes robust inference difficult because the data are sparse, noisy, and nonstationary. This paper focuses on prediction from sparse, nonstationary electronic health record data in the intensive care unit (ICU) using data assimilation (DA), a broad collection of methods that pair mechanistic models with inference machinery such as the Kalman filter. We find that making inference with sparse clinical data accurate and robust requires advances beyond standard DA methods, combined with additional machine learning methods. Specifically, we show that combining the newly developed constrained ensemble Kalman filter with machine learning methods can produce substantial gains in robustness and accuracy while minimizing data requirements. We also identify limitations of Kalman filtering methods that pose new problems that must be overcome to make inference feasible in clinical settings with realistic clinical data.