We evaluate two popular local explainability techniques, LIME and SHAP, on a movie recommendation task. We discover that the two methods behave very differently depending on the sparsity of the data set: LIME outperforms SHAP in dense segments of the data, while SHAP outperforms LIME in sparse segments. We trace this difference to the differing bias-variance characteristics of the two methods' underlying estimators. In particular, SHAP exhibits lower variance than LIME in sparse segments of the data, which we attribute to the completeness constraint property inherent in SHAP and missing in LIME. This constraint acts as a regularizer: it increases the bias of the SHAP estimator but decreases its variance, leading to a favorable bias-variance trade-off, especially in high-sparsity data settings. With this insight, we introduce the same constraint into LIME and formulate a novel local explainability framework called Completeness-Constrained LIME (CLIMB) that is superior to LIME and much faster than SHAP.
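To make the constraint concrete, here is a minimal sketch of how a completeness-constrained LIME fit could look: the usual LIME weighted least-squares surrogate over binary perturbations of the instance is solved subject to the equality constraint that the attributions sum to f(x) − f(baseline). This is an illustrative reconstruction, not the paper's implementation; the function names, the masking scheme, and the kernel choice are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def completeness_constrained_lime(f, x, baseline, n_samples=1000, kernel_width=0.5):
    """Sketch of a completeness-constrained LIME explainer (hypothetical API).

    Fits the standard LIME weighted least-squares surrogate over binary
    perturbations z of the instance x, but adds the equality constraint
        sum_i w_i = f(x) - f(baseline),
    so the attributions account for the full prediction gap.
    """
    d = x.shape[0]
    fx, fb = f(x[None, :])[0], f(baseline[None, :])[0]
    c = fx - fb                                   # completeness target

    # 1. Sample binary masks and build the perturbed inputs
    #    (1 = keep the feature from x, 0 = replace it with the baseline value).
    Z = rng.integers(0, 2, size=(n_samples, d))
    X_pert = np.where(Z == 1, x, baseline)
    y = f(X_pert) - fb                            # regress the gap above baseline

    # 2. LIME-style proximity weights: exponential kernel on the
    #    fraction of ablated features (an illustrative distance choice).
    dist = 1.0 - Z.mean(axis=1)
    pi = np.exp(-(dist ** 2) / kernel_width ** 2)

    # 3. Equality-constrained weighted least squares via its KKT system:
    #    minimize sum_j pi_j * (y_j - Z_j @ w)^2   s.t.   1^T w = c
    A = Z.T @ (pi[:, None] * Z)                   # d x d weighted Gram matrix
    b = Z.T @ (pi * y)
    K = np.block([[A, np.ones((d, 1))],
                  [np.ones((1, d)), np.zeros((1, 1))]])
    sol = np.linalg.solve(K, np.append(b, c))
    return sol[:d]                                # feature attributions

# Toy usage: a linear model, where the correct attributions are x_i * w_i
# and must sum exactly to f(x) - f(baseline).
w_true = np.array([2.0, -1.0, 0.5])
f = lambda X: X @ w_true
x, baseline = np.ones(3), np.zeros(3)
attr = completeness_constrained_lime(f, x, baseline)
print(attr, attr.sum())  # attributions near [2, -1, 0.5]; sum equals 1.5
```

Note the design implication consistent with the abstract's argument: the constraint removes one degree of freedom from the surrogate fit, which is exactly the regularizing effect described above, and the solve remains a single small linear system over ordinary LIME samples rather than a Shapley-kernel estimate over coalitions, which is where a speed advantage over SHAP would plausibly come from.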