One potential solution for model interpretation is to train a surrogate model: a more transparent model that approximates the behavior of the model to be explained. Classification rules or decision trees are typically used because their logic-based expressions are intelligible. However, when approximating a complex model, decision trees can grow too deep and rule sets can become too large. Rules are more flexible than paths on a decision tree, which must share ancestor nodes (conditions), but the unstructured visual representation of rules makes it hard to draw inferences across rules. To address these issues, we present a workflow that includes novel algorithmic and interactive solutions. First, we present Hierarchical Surrogate Rules (HSR), an algorithm that generates hierarchical rules based on user-defined parameters. We also contribute SuRE, a visual analytics (VA) system that integrates HSR with interactive surrogate rule visualizations. In particular, we present a novel feature-aligned tree that overcomes the shortcomings of existing rule visualizations. We evaluate the algorithm in terms of parameter sensitivity, time performance, and comparison with surrogate decision trees, and find that it scales reasonably well and outperforms decision trees in many respects. We also evaluate the visualization and the VA system through a usability study with 24 volunteers and an observational study with 7 domain experts. Our investigation shows that participants can use feature-aligned trees to perform non-trivial tasks with very high accuracy. We also discuss many interesting observations that can be useful for future research on designing effective rule-based VA systems.
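To make the surrogate idea concrete, the following is a minimal sketch in plain Python. It is not HSR or SuRE: it fits an ordinary greedy decision tree as the surrogate, and the names `black_box`, `fit_tree`, `predict`, and `fidelity` are hypothetical, chosen only for illustration. The point it shows is that the surrogate is trained on the labels produced by the model being explained, not on ground truth, and is then judged by how often it agrees with that model.

```python
import random

def black_box(x):
    """Hypothetical stand-in for the opaque model: any callable returning 0 or 1."""
    return int(x[0] * x[0] + x[1] * x[1] < 0.5)

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def majority(labels):
    """Most common label (ties resolve to 1)."""
    return int(2 * sum(labels) >= len(labels))

def fit_tree(X, y, depth):
    """Greedy tree: a leaf is an int label, a node is (feature, threshold, left, right)."""
    if depth == 0 or len(set(y)) == 1:
        return majority(y)
    best = None
    for f in range(len(X[0])):
        # Skip the smallest value so the left branch is never empty.
        for t in sorted({x[f] for x in X})[1:]:
            left = [yi for x, yi in zip(X, y) if x[f] < t]
            right = [yi for x, yi in zip(X, y) if x[f] >= t]
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # no feature has two distinct values
        return majority(y)
    _, f, t = best
    lo = [(x, yi) for x, yi in zip(X, y) if x[f] < t]
    hi = [(x, yi) for x, yi in zip(X, y) if x[f] >= t]
    return (f, t,
            fit_tree([x for x, _ in lo], [yi for _, yi in lo], depth - 1),
            fit_tree([x for x, _ in hi], [yi for _, yi in hi], depth - 1))

def predict(tree, x):
    """Follow the conditions from the root down to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] < t else right
    return tree

def fidelity(tree, X):
    """Fraction of points on which the surrogate agrees with the black box."""
    return sum(predict(tree, x) == black_box(x) for x in X) / len(X)

rng = random.Random(0)
X_train = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(300)]
y_train = [black_box(x) for x in X_train]  # labels come from the model, not ground truth
X_test = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]

shallow = fit_tree(X_train, y_train, depth=1)
deeper = fit_tree(X_train, y_train, depth=4)
print("depth 1 fidelity:", fidelity(shallow, X_test))
print("depth 4 fidelity:", fidelity(deeper, X_test))
```

Comparing the two depths illustrates the tension described above: a shallow surrogate is easy to read but agrees with the model less often, while raising fidelity generally requires more conditions.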