The development of neural networks for clinical artificial intelligence (AI) relies on interpretability, transparency, and performance. It is essential to look inside the black-box neural network and derive interpretable explanations of model output. A task of high clinical importance is predicting the likelihood of a patient being readmitted to hospital in the near future, which enables efficient triage. With the increasing adoption of electronic health records (EHRs), there is great interest in applying natural language processing (NLP) to the clinical free text contained within EHRs. In this work, we apply InfoCal, the current state-of-the-art model that produces extractive rationales for its predictions, to the task of predicting hospital readmission from hospital discharge notes. We compare the extractive rationales produced by InfoCal to competitive transformer-based models that are pretrained on clinical text data and whose attention mechanism can be used for interpretation. We find that each presented model, paired with selected interpretability or feature importance methods, yields varying results, with clinical language domain expertise and pretraining critical to performance and subsequent interpretability.