Time series models based on recurrent neural networks (RNNs) can achieve high accuracy but are difficult to interpret because of feature interactions, temporal interactions, and non-linear transformations. Interpretability is important in domains such as healthcare, where models must provide insight into the relationships they have learned before their predictions can be validated and trusted. We want accurate time series models in which users can understand the contribution of each individual input feature. We present the Interpretable-RNN (I-RNN), which balances model complexity and accuracy by forcing the relationship between variables in the model to be additive. Interactions between the hidden states of the RNN are restricted, and the hidden states are combined additively at the final step. I-RNN is designed to capture the distinctive characteristics of clinical time series, which are unevenly sampled in time, asynchronously acquired, and contain missing data. Importantly, the hidden state activations represent feature coefficients that correlate with the prediction target and can be visualized as risk curves that capture the global relationship between individual input features and the outcome. We evaluate the I-RNN model on the PhysioNet 2012 Challenge dataset to predict in-hospital mortality, and on a real-world clinical decision support task: predicting hemodynamic interventions in the intensive care unit. I-RNN provides explanations in the form of global and local feature importances comparable to those of highly intelligible models, such as decision trees trained on hand-engineered features, while significantly outperforming them. At the same time, I-RNN achieves accuracy comparable to state-of-the-art decay-based and interpolation-based recurrent time series models. The experimental results on these real-world clinical datasets refute the myth of an unavoidable tradeoff between accuracy and interpretability.
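The additive structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the recurrent cell, so this assumes a plain tanh cell per feature that receives the observed value and the time since the previous observation, with random untrained weights. All names (`make_feature_rnn`, `feature_contribution`, `predict`, the feature names, and the patient values) are hypothetical.

```python
import math
import random


def make_feature_rnn(hidden_size, rng):
    """Random (untrained) weights for one per-feature tanh RNN cell (assumed cell type)."""
    def rand_vec(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]
    return {
        "w_in": rand_vec(hidden_size),   # weight on the observed value
        "w_dt": rand_vec(hidden_size),   # weight on the time since the last observation
        "w_rec": [rand_vec(hidden_size) for _ in range(hidden_size)],
        "w_out": rand_vec(hidden_size),  # maps the final hidden state to a scalar
    }


def feature_contribution(params, observations):
    """Run one feature's RNN over its own (time, value) pairs only.

    Missing values are simply absent from `observations`, so each feature
    keeps its own irregular sampling grid and never sees other features.
    """
    hidden_size = len(params["w_in"])
    h = [0.0] * hidden_size
    prev_time = None
    for time, value in observations:
        dt = 0.0 if prev_time is None else time - prev_time
        h = [
            math.tanh(
                params["w_in"][i] * value
                + params["w_dt"][i] * dt
                + sum(params["w_rec"][i][k] * h[k] for k in range(hidden_size))
            )
            for i in range(hidden_size)
        ]
        prev_time = time
    return sum(w * hk for w, hk in zip(params["w_out"], h))


def predict(model, patient):
    """Additive combination at the final step: logit = bias + sum of contributions."""
    contributions = {
        name: feature_contribution(model["features"][name], patient.get(name, []))
        for name in model["features"]
    }
    logit = model["bias"] + sum(contributions.values())
    risk = 1.0 / (1.0 + math.exp(-logit))
    return risk, contributions


rng = random.Random(0)
model = {
    "bias": -1.0,
    "features": {name: make_feature_rnn(4, rng) for name in ("heart_rate", "lactate")},
}
# Hypothetical patient: (hours since admission, standardized value).
# Lactate is sampled less often than heart rate.
patient = {
    "heart_rate": [(0.0, 0.2), (1.0, 0.5), (2.5, 1.1)],
    "lactate": [(0.5, 1.4)],
}
risk, contributions = predict(model, patient)
```

Because the logit is a plain sum, each entry of `contributions` can be read as a local importance for that feature, and changing one feature's observations leaves every other feature's contribution unchanged. Plotting a feature's contribution against its input value over many patients would give one plausible analogue of the risk curves mentioned above.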