As the complexity of machine learning (ML) models increases and their applications in different (and critical) domains grow, there is a strong demand for more interpretable and trustworthy ML. One straightforward and model-agnostic way to interpret complex ML models is to train surrogate models, such as rule sets and decision trees, that sufficiently approximate the original ones while being simpler and easier to explain. Yet, rule sets can become very lengthy, with many if-else statements, and decision tree depth grows rapidly when accurately emulating complex ML models. In such cases, both approaches can fail to meet their core goal of providing users with model interpretability. We tackle this by proposing DeforestVis, a visual analytics tool that offers a user-friendly summarization of the behavior of complex ML models by providing surrogate decision stumps (one-level decision trees) generated with the Adaptive Boosting (AdaBoost) technique. Our solution helps users to explore the complexity vs. fidelity trade-off by incrementally generating more stumps, create attribute-based explanations with weighted stumps to justify decision making, and analyze the impact of rule overriding on training instance allocation between one or more stumps. An independent test set allows users to monitor the effectiveness of manual rule changes and form hypotheses based on case-by-case investigations. We show the applicability and usefulness of DeforestVis with two use cases and expert interviews with data analysts and model developers.
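To make the surrogate idea concrete, the following is a minimal sketch (not the authors' implementation) of how decision stumps can be fit with AdaBoost to approximate a complex model's behavior: the boosted stumps are trained on the complex model's predictions rather than the ground-truth labels, fidelity is measured as agreement with the complex model on held-out data, and each weighted stump yields a single attribute/threshold rule. All names (`complex_model`, `surrogate`, the synthetic data) are illustrative assumptions, using scikit-learn.

```python
# Illustrative sketch only: AdaBoost decision stumps as a surrogate of a complex model.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Stand-in for the complex model to be explained (here: a random forest).
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_test, y_train = X[:800], X[800:], y[:800]
complex_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Surrogate: AdaBoost over one-level trees (decision stumps), fitted to the
# complex model's predictions. (In scikit-learn < 1.2 the parameter is
# `base_estimator` instead of `estimator`.)
surrogate = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=20,  # more stumps -> higher fidelity, lower simplicity
    random_state=0,
).fit(X_train, complex_model.predict(X_train))

# Fidelity: how often the surrogate agrees with the complex model on held-out data.
fidelity = accuracy_score(complex_model.predict(X_test), surrogate.predict(X_test))
print(f"Fidelity on independent test set: {fidelity:.3f}")

# Each stump is a single attribute/threshold rule with an associated weight,
# i.e., the kind of attribute-based explanation a tool like DeforestVis summarizes.
for stump, weight in zip(surrogate.estimators_, surrogate.estimator_weights_):
    feat, thr = stump.tree_.feature[0], stump.tree_.threshold[0]
    print(f"feature {feat} <= {thr:.3f}  (weight {weight:.2f})")
```

Increasing `n_estimators` corresponds to the complexity vs. fidelity trade-off described above: more stumps typically raise agreement with the complex model at the cost of a longer, harder-to-read rule summary.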