Generalized additive models (GAMs) have become a leading model class for interpretable machine learning. However, there are many algorithms for training GAMs, and these can learn different or even contradictory models while being equally accurate. Which GAM should we trust? In this paper, we quantitatively and qualitatively investigate a variety of GAM algorithms on real and simulated datasets. We find that GAMs with high feature sparsity (using only a few variables to make predictions) can miss patterns in the data and be unfair to rare subpopulations. Our results suggest that inductive bias plays a crucial role in what interpretable models learn, and that tree-based GAMs offer the best balance of sparsity, fidelity, and accuracy and thus appear to be the most trustworthy GAMs.