Beta coefficients for linear regression models represent the ideal form of an interpretable feature effect. However, for non-linear models and especially generalized linear models, the estimated coefficients cannot be interpreted as a direct feature effect on the predicted outcome. Hence, marginal effects are typically used as approximations for feature effects, either as derivatives of the prediction function or as forward differences in prediction due to a change in a feature value. While marginal effects are commonly used in many scientific fields, they have not yet been adopted as a model-agnostic interpretation method for machine learning models. This may stem from their inflexibility as a univariate feature effect and their inability to deal with the non-linearities found in black box models. We introduce a new class of marginal effects termed forward marginal effects. We argue for abandoning derivatives in favor of more interpretable forward differences. Furthermore, we generalize marginal effects based on forward differences to multivariate changes in feature values. To account for the non-linearity of prediction functions, we introduce a non-linearity measure for marginal effects. We argue against summarizing the feature effects of a non-linear prediction function in a single metric such as the average marginal effect. Instead, we propose to partition the feature space and compute conditional average marginal effects on feature subspaces, which serve as conditional feature effect estimates.
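The core idea described above can be illustrated with a minimal sketch (not the paper's implementation): a forward marginal effect for feature j is the forward difference f(x with x_j + h) - f(x), and averaging it over observations yields the single-number summary (the average marginal effect) that the abstract argues against. The model, data, and step size h here are hypothetical, chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def forward_marginal_effect(model, X, feature_idx, h):
    """Forward difference in prediction: f(x with x_j + h) - f(x),
    evaluated separately for each observation (row of X)."""
    X_shifted = X.copy()
    X_shifted[:, feature_idx] = X_shifted[:, feature_idx] + h
    return model.predict(X_shifted) - model.predict(X)

# Hypothetical usage on synthetic data with a black box model.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.1, size=200)
model = RandomForestRegressor(random_state=0).fit(X, y)

# Per-observation forward marginal effects for feature 1 with step h = 0.5.
fme = forward_marginal_effect(model, X, feature_idx=1, h=0.5)

# The average marginal effect collapses these into one number, which can
# hide heterogeneity when the prediction function is non-linear.
ame = fme.mean()
```

Because the effects are computed per observation, their spread across the data (rather than only the mean) reveals how the effect varies over the feature space, which motivates the conditional, subspace-level averages proposed in the abstract.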