Machine learning models trained on behavioral and textual data can be highly accurate, but are often very difficult to interpret. Rule-extraction techniques have been proposed to combine the desired predictive accuracy of complex "black-box" models with global explainability. However, rule extraction on high-dimensional, sparse data, where many features are relevant to the predictions, is challenging: replacing the black-box model with many rules leaves the user again with an incomprehensible explanation. To address this problem, we develop and test a rule-extraction methodology based on higher-level, less-sparse metafeatures. A key finding of our analysis is that metafeature-based explanations are better at mimicking the behavior of the black-box prediction model, as measured by the fidelity of the explanations.
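To make the approach concrete, below is a minimal sketch (not the authors' exact pipeline) of metafeature-based rule extraction and fidelity measurement, assuming scikit-learn. The synthetic sparse binary data, the NMF grouping step, and the depth-3 surrogate tree are all illustrative choices.

```python
# Sketch: extract a small rule set on metafeatures that mimics a black-box
# model, and measure fidelity (agreement with the black-box's predictions).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.decomposition import NMF
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.binomial(1, 0.05, size=(2000, 500)).astype(float)  # sparse binary "behavioral" data
y = (X[:, :20].sum(axis=1) > 1).astype(int)                # synthetic target

# 1. Black-box model on the original high-dimensional features.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
y_bb = black_box.predict(X)  # predictions the surrogate must mimic

# 2. Metafeatures: group the many sparse features into a few dense,
#    higher-level dimensions (here via non-negative matrix factorization).
Z = NMF(n_components=10, random_state=0, max_iter=500).fit_transform(X)

# 3. Global surrogate: a shallow tree (a small rule set) trained on the
#    black-box's predictions rather than on the true labels.
rules = DecisionTreeClassifier(max_depth=3, random_state=0).fit(Z, y_bb)

# 4. Fidelity: the fraction of instances on which the extracted rules
#    agree with the black-box model.
fidelity = accuracy_score(y_bb, rules.predict(Z))
print(f"fidelity = {fidelity:.3f}")
```

In practice fidelity would be evaluated on held-out data; the in-sample computation here only keeps the sketch short.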