Feature attribution for kernel methods is often heuristic and not individualised for each prediction. To address this, we turn to the concept of Shapley values, a coalitional game-theoretic framework that has previously been applied to different machine learning model interpretation tasks, such as linear models, tree ensembles and deep networks. By analysing Shapley values from a functional perspective, we propose \textsc{RKHS-SHAP}, an attribution method for kernel machines that can efficiently compute both \emph{Interventional} and \emph{Observational Shapley values} using kernel mean embeddings of distributions. We show theoretically that our method is robust with respect to local perturbations, a key yet often overlooked desideratum for interpretability. Further, we propose the \emph{Shapley regulariser}, applicable to a general empirical risk minimisation framework, which allows learning while controlling the level of specific features' contributions to the model. We demonstrate that the Shapley regulariser enables learning that is robust to covariate shift of a given feature, as well as fair learning that controls the Shapley values of sensitive features.
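As a point of reference, and using standard notation from the Shapley value literature (with $d$ features, a model $f$, and a value function $v_x$ defined per prediction point $x$), the Shapley attribution of feature $i$ at $x$ is
\begin{equation*}
\phi_i(x) \;=\; \sum_{S \subseteq \{1,\dots,d\} \setminus \{i\}} \frac{|S|!\,(d-|S|-1)!}{d!}\,\bigl(v_x(S \cup \{i\}) - v_x(S)\bigr).
\end{equation*}
The two value functions referred to above are, in the usual formulation, the interventional one, $v_x(S) = \mathbb{E}\bigl[f(x_S, X_{\setminus S})\bigr]$, which marginalises over the remaining features, and the observational one, $v_x(S) = \mathbb{E}\bigl[f(X) \mid X_S = x_S\bigr]$, which conditions on the observed coordinates; it is conditional expectations of this latter kind that kernel mean embeddings of distributions allow us to estimate.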