Given a machine learning (ML) model and a prediction, explanations can be defined as sets of features that are sufficient for the prediction. In some applications, besides asking for an explanation, it is also critical to understand whether a sensitive feature can occur in some explanation, or whether an uninteresting feature must occur in every explanation. This paper starts by relating such queries with the problems of relevancy and necessity in logic-based abduction, respectively. The paper then proves membership and hardness results for several families of ML classifiers. Afterwards, the paper proposes concrete algorithms for two classes of classifiers. The experimental results confirm the scalability of the proposed algorithms.
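To make the queries concrete, the following is a minimal brute-force sketch (not the paper's algorithms) of the definitions above: an explanation is a subset-minimal set of features that, fixed to their values in the instance, forces the prediction; a feature is *relevant* if it occurs in some explanation and *necessary* if it occurs in every explanation. The toy boolean classifier `f` is a hypothetical stand-in for illustration.

```python
from itertools import combinations, product

def f(x):
    # Hypothetical toy boolean classifier: x0 AND (x1 OR x2).
    return x[0] and (x[1] or x[2])

def sufficient(S, x):
    # S is sufficient for the prediction f(x) if fixing the features in S
    # to their values in x forces the prediction, no matter how the
    # remaining (free) features are completed.
    free = [i for i in range(len(x)) if i not in S]
    for vals in product([0, 1], repeat=len(free)):
        y = list(x)
        for i, v in zip(free, vals):
            y[i] = v
        if f(y) != f(x):
            return False
    return True

def explanations(x):
    # Enumerate subset-minimal sufficient sets by growing subset size,
    # skipping any candidate that contains an already-found explanation.
    expls = []
    for k in range(len(x) + 1):
        for S in combinations(range(len(x)), k):
            S = set(S)
            if sufficient(S, x) and not any(E <= S for E in expls):
                expls.append(S)
    return expls

x = (1, 1, 1)
E = explanations(x)
relevant = {i for i in range(len(x)) if any(i in S for S in E)}    # in some explanation
necessary = {i for i in range(len(x)) if all(i in S for S in E)}   # in all explanations
```

For this toy instance the explanations are {x0, x1} and {x0, x2}, so all three features are relevant, but only x0 is necessary. The paper's point is that answering such queries without explicit enumeration is computationally non-trivial for realistic classifier families.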