Deep learning models have achieved impressive prediction performance but often sacrifice interpretability, a critical consideration in high-stakes domains such as healthcare or policymaking. In contrast, generalized additive models (GAMs) can maintain interpretability but often suffer from poor prediction performance due to their inability to effectively capture feature interactions. In this work, we aim to bridge this gap by using pre-trained neural language models to extract embeddings for each input before learning a linear model in the embedding space. The final model (which we call Emb-GAM) is a transparent, linear function of its input features and feature interactions. Leveraging the language model allows Emb-GAM to learn far fewer linear coefficients, model larger interactions, and generalize well to novel inputs (e.g., unseen n-grams in text). Across a variety of natural-language-processing datasets, Emb-GAM achieves strong prediction performance without sacrificing interpretability. All code is made available on GitHub.
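The pipeline described above can be illustrated in a few lines: embed each n-gram of an input with a pre-trained language model, sum the embeddings, and fit a linear model in that space. The sketch below is illustrative rather than the released implementation; it assumes a HuggingFace BERT encoder and scikit-learn's logistic regression, and the helper names (`ngrams`, `embed`, `featurize`) and toy data are invented for exposition.

```python
# Minimal sketch of the Emb-GAM idea: embed each n-gram with a pre-trained
# language model, sum the embeddings, then fit a linear model in the
# embedding space. Helper names and toy data are illustrative only.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def ngrams(words, n_max=2):
    """All contiguous n-grams (as strings) up to length n_max."""
    return [" ".join(words[i:i + n]) for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]

@torch.no_grad()
def embed(text):
    """Mean-pooled last-hidden-state embedding of a string."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    return model(**inputs).last_hidden_state.mean(dim=1).squeeze(0).numpy()

def featurize(sentence, n_max=2):
    """Sum of n-gram embeddings: the additive structure that keeps the
    fitted linear model interpretable at the n-gram level."""
    return np.sum([embed(g) for g in ngrams(sentence.split(), n_max)], axis=0)

# Toy sentiment data, purely for demonstration.
texts = ["a great movie", "truly awful film", "loved it", "waste of time"]
labels = [1, 0, 1, 0]
X = np.stack([featurize(t) for t in texts])
clf = LogisticRegression(max_iter=1000).fit(X, labels)

# Because the model is linear over a sum of n-gram embeddings, each n-gram
# contributes a fixed scalar: its embedding dotted with the coefficients.
contribution = embed("great movie") @ clf.coef_[0]
print(f"contribution of 'great movie': {contribution:.3f}")
```

Because prediction is linear in the summed embeddings, the score decomposes exactly into per-n-gram contributions, which is what makes the final model a transparent function of its input features and feature interactions.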