Recent work has unveiled a theory for reasoning about the decisions made by binary classifiers: a classifier describes a Boolean function, and the reasons behind an instance being classified as positive are the prime implicants of the function that are satisfied by the instance. One drawback of these works is that they do not explicitly treat scenarios where the underlying data is known to be constrained, e.g., where certain combinations of features cannot exist, cannot be observed, or must be disregarded. We propose a more general theory, also based on prime implicants, tailored to taking such constraints into account. The main idea is to view a classifier in the presence of constraints as describing a partial Boolean function, i.e., one that is undefined on instances that do not satisfy the constraints. We prove that this simple idea yields reasons that are no less (and sometimes more) succinct: reasons computed without taking constraints into account (e.g., by ignoring constraint-violating instances or treating them as negative) are subsumed by reasons that do take the constraints into account. We illustrate this improved parsimony on synthetic classifiers and on classifiers learned from real data.
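To make the idea concrete, here is a minimal brute-force sketch (not from the paper; the classifier and constraint below are hypothetical examples) in Python. It enumerates the subset-minimal terms of a positive instance that are implicants of the classifier restricted to the constraint, i.e., its prime-implicant reasons: for the classifier x1 AND x2, the instance (x1=1, x2=1) has the full term as its only reason without constraints, but under the constraint x1 -> x2 (the combination x1=1, x2=0 cannot occur) the shorter reason {x1=1} emerges, illustrating the subsumption claim.

```python
from itertools import product, combinations

def is_implicant(term, f, constraint, n):
    """term: dict mapping a variable index to a Boolean value.
    Check that every total assignment consistent with the term and
    satisfying the constraint is classified positive by f; points
    violating the constraint are disregarded (partial function)."""
    for point in product([False, True], repeat=n):
        if all(point[i] == v for i, v in term.items()) and constraint(point):
            if not f(point):
                return False
    return True

def prime_reasons(instance, f, constraint):
    """All subset-minimal sub-terms of the instance that are implicants
    of f restricted to the constraint, i.e., its prime-implicant reasons."""
    n = len(instance)
    reasons = []
    for size in range(n + 1):  # increasing size: minimality by construction
        for idxs in combinations(range(n), size):
            term = {i: instance[i] for i in idxs}
            if is_implicant(term, f, constraint, n):
                # skip terms subsumed by an already-found shorter reason
                if not any(r.keys() <= term.keys() for r in reasons):
                    reasons.append(term)
    return reasons

f = lambda p: p[0] and p[1]        # hypothetical classifier: x1 AND x2
x = (True, True)                   # positive instance

unconstrained = lambda p: True
c = lambda p: (not p[0]) or p[1]   # constraint: x1 -> x2

print(prime_reasons(x, f, unconstrained))  # [{0: True, 1: True}]
print(prime_reasons(x, f, c))              # [{0: True}] -- strictly shorter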