Most existing post-hoc explanation approaches for machine learning models produce independent, per-variable feature attribution scores. They thereby ignore a critical characteristic of the data: the inter-variable relationships between features that naturally occur in visual and textual data. In response, we develop a novel model-agnostic, permutation-based feature attribution algorithm built on relational analysis between input variables. This gives us broader insight into both the decisions of a machine learning model and the data itself. This type of local explanation measures the effects of interrelationships between local features, which provides another critical aspect of explanation. Experimental evaluations of our framework on both image and text data modalities demonstrate its effectiveness and validity.
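To make the idea of attributing to relationships between input variables concrete, the following is a minimal illustrative sketch and not the algorithm proposed in this work. It assumes a simple stand-in technique: a second-order finite difference over perturbed inputs, in which features i and j of one instance are replaced with values drawn from a background set, separately and jointly. The function name `pairwise_interaction`, its parameters, and the toy model are all hypothetical and chosen only for illustration.

```python
import numpy as np

def pairwise_interaction(predict, x, background, i, j, n_samples=200, seed=0):
    """Hypothetical local interaction score for features i and j of instance x.

    Assumption (not from the paper): the score is the second-order difference
        f(x) - f(x with i replaced) - f(x with j replaced) + f(x with both replaced),
    averaged over replacement values sampled from `background`.
    It is zero whenever the model is additive in features i and j.
    """
    rng = np.random.default_rng(seed)
    # Sample background rows once so all three perturbations share the same draws.
    rows = rng.integers(0, len(background), size=n_samples)
    base = np.tile(x, (n_samples, 1))

    def mean_prediction_with_replaced(cols):
        z = base.copy()
        z[:, cols] = background[rows][:, cols]  # overwrite only the chosen features
        return predict(z).mean()

    f_x = predict(x[None, :])[0]
    p_i = mean_prediction_with_replaced([i])
    p_j = mean_prediction_with_replaced([j])
    p_ij = mean_prediction_with_replaced([i, j])
    return f_x - p_i - p_j + p_ij


# Toy black-box model (an assumption for the demo): features 0 and 1 interact,
# feature 2 contributes additively.
def toy_model(z):
    return z[:, 0] * z[:, 1] + z[:, 2]

x = np.array([1.0, 1.0, 1.0])
background = np.random.default_rng(1).normal(size=(500, 3))
print("interaction(0, 1):", pairwise_interaction(toy_model, x, background, 0, 1))
print("interaction(0, 2):", pairwise_interaction(toy_model, x, background, 0, 2))
```

The sketch only calls `predict`, so it is model-agnostic in the same sense as the abstract uses the term. A per-variable score alone would not separate the product term between features 0 and 1 from the additive contribution of feature 2, whereas the pairwise difference does.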