Recent work in Explainable AI mostly addresses the transparency of black-box models or creates explanations for any kind of model (i.e., it is model-agnostic), leaving explanations of interpretable models largely underexplored. In this paper, we fill this gap by focusing on explanations for a specific interpretable model, namely pattern-based logistic regression (PLR) for binary text classification. We do so because, albeit interpretable, PLR is challenging to explain. In particular, we found that a standard way to extract explanations from this model does not consider relations among the features, making the explanations hardly plausible to humans. Hence, we propose AXPLR, a novel explanation method that uses (forms of) computational argumentation to generate explanations (for outputs computed by PLR) which unearth model agreements and disagreements among the features. Specifically, we use computational argumentation as follows: we see features (patterns) in PLR as arguments in a form of quantified bipolar argumentation frameworks (QBAFs) and extract attacks and supports between arguments based on the specificity of the arguments; we understand logistic regression as a gradual semantics for these QBAFs, used to determine the arguments' dialectic strength; and we study standard properties of gradual semantics for QBAFs in the context of our argumentative re-interpretation of PLR, sanctioning its suitability for explanatory purposes. We then show how to extract intuitive explanations (for outputs computed by PLR) from the constructed QBAFs. Finally, we conduct an empirical evaluation and two experiments in the context of human-AI collaboration to demonstrate the advantages of our resulting AXPLR method.
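To make the idea of reading PLR argumentatively more concrete, the following is a minimal sketch and not the AXPLR construction itself. It assumes, purely for illustration, that a pattern is a set of words that must all occur in the text, that one pattern is more specific than another when its word set is a proper superset, and that a more specific pattern supports a more general one when their weights share a sign and attacks it otherwise. The pattern names, weights, bias, and the helper functions `matched`, `predict`, and `relations` are all hypothetical.

```python
import math
from itertools import permutations

# Hypothetical patterns: name -> (words that must all occur, learned weight).
# The weights and bias are made-up numbers, not values from the paper.
PATTERNS = {
    "good": (frozenset({"good"}), 1.2),
    "not_good": (frozenset({"not", "good"}), -2.0),
    "very_good": (frozenset({"very", "good"}), 0.8),
}
BIAS = -0.1


def matched(text):
    """Return the names of the patterns whose words all occur in the text."""
    tokens = set(text.lower().split())
    return [name for name, (words, _) in PATTERNS.items() if words <= tokens]


def predict(text):
    """Logistic regression over binary pattern features: sigmoid(bias + sum of weights)."""
    z = BIAS + sum(PATTERNS[name][1] for name in matched(text))
    return 1.0 / (1.0 + math.exp(-z))


def relations(text):
    """Relate each matched pattern to every strictly more general matched pattern.

    Assumed rule (for illustration only): same-sign weights give a support,
    opposite-sign weights give an attack.
    """
    edges = []
    for specific, general in permutations(matched(text), 2):
        words_s, weight_s = PATTERNS[specific]
        words_g, weight_g = PATTERNS[general]
        if words_s > words_g:  # proper superset: strictly more specific
            kind = "support" if weight_s * weight_g > 0 else "attack"
            edges.append((specific, kind, general))
    return edges


if __name__ == "__main__":
    for sentence in ("this is not good", "this is very good"):
        print(sentence, round(predict(sentence), 3), relations(sentence))
```

In this toy setting, a flat list of weighted patterns would report "good" as positive evidence even in "this is not good", whereas the relation view shows the more specific pattern disagreeing with it, which is the kind of agreement and disagreement among features the abstract refers to.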