Different users of machine learning methods require different explanations, depending on their goals. To make machine learning accountable to society, one important goal is to provide actionable options for recourse, which allow an affected user to change the decision $f(x)$ of a machine learning system by making limited changes to its input $x$. We formalize this by providing a general definition of recourse sensitivity, which needs to be instantiated with a utility function that describes which changes to the decisions are relevant to the user. This definition applies to local attribution methods, which attribute an importance weight to each input feature. It is often argued that such local attributions should be robust, in the sense that a small change in the input $x$ that is being explained should not cause a large change in the feature weights. However, we prove formally that it is in general impossible for any single attribution method to be both recourse sensitive and robust at the same time. It follows that there must always exist counterexamples to at least one of these properties. We provide such counterexamples for several popular attribution methods, including LIME, SHAP, Integrated Gradients and SmoothGrad. Our results also cover counterfactual explanations, which may be viewed as attributions that describe a perturbation of $x$. We further discuss possible ways to work around our impossibility result, for instance by allowing the output to consist of sets with multiple attributions, and we provide sufficient conditions for specific classes of continuous functions to be recourse sensitive. Finally, we strengthen our impossibility result for the restricted case where users are only able to change a single attribute of $x$, by providing an exact characterization of the functions $f$ to which impossibility applies.