Childhood obesity is a major public health challenge. Early identification of children at high risk of developing obesity may enable earlier and more effective interventions to prevent and manage it. Most existing predictive tools for childhood obesity rely on traditional regression-type methods that use only a few hand-picked features and do not exploit longitudinal patterns in children's data. Deep learning methods allow the use of high-dimensional longitudinal datasets. In this paper, we present a deep learning model designed to predict future obesity patterns from generally available items in children's medical history. To do this, we use a large unaugmented electronic health records (EHR) dataset from a large pediatric health system. We adopt a general LSTM network architecture, which is known to better represent longitudinal data. We train our proposed model on both dynamic and static EHR data. Our model is used to predict obesity for ages between 2 and 20 years. We compare the performance of our LSTM model with other machine learning methods that aggregate over sequential data and ignore temporality. To add interpretability, we additionally include an attention layer that computes attention scores for the timestamps and ranks the features at each timestamp.
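As an illustration of the attention-based interpretability step described above, the following is a minimal sketch of scoring and ranking timestamps. It is not the model from the paper: the dot-product scoring, the `query` vector, the toy `hidden_states`, and all function names are assumptions made here for illustration. In practice the hidden states would come from the LSTM and the query would be learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Score each timestamp and pool the hidden states.

    hidden_states: list of T vectors, one per timestamp (assumed LSTM outputs).
    query: vector of the same dimension (assumed to be learned; hypothetical).
    Returns (weights, context): one attention weight per timestamp and the
    attention-weighted sum of the hidden states.
    """
    # Dot-product score per timestamp (an assumed scoring function).
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context

def rank_timestamps(weights):
    """Return timestamp indices ordered from most to least attended."""
    return sorted(range(len(weights)), key=lambda t: weights[t], reverse=True)

# Toy example: three visits with 2-dimensional hidden states (made-up values).
hidden_states = [[0.1, 0.0], [0.5, 0.2], [1.0, 0.9]]
query = [1.0, 1.0]

weights, context = attention_pool(hidden_states, query)
ranking = rank_timestamps(weights)
```

The attention weights sum to one across timestamps, so they can be read as a relative importance ranking of visits. Ranking the features within a timestamp would need a second scoring step that this sketch does not cover.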