Machine learning models built on behavioral and textual data can be highly accurate predictors, but such models are often difficult to interpret. Linear models require investigating thousands of coefficients, while the opaqueness of nonlinear models makes interpretation even harder. Rule-extraction techniques have been proposed to combine the desired predictive accuracy of complex "black-box" models with global explainability. However, rule-extraction in the context of high-dimensional, sparse data, where many features are relevant to the predictions, can be challenging: replacing the black-box model with many rules leaves the user, once again, with an incomprehensible explanation. To address this problem, we develop and test a rule-extraction methodology based on higher-level, less-sparse "metafeatures". We empirically validate the quality of the explanation rules in terms of fidelity, stability, and accuracy over a collection of data sets, and benchmark their performance against rules extracted using the fine-grained behavioral and textual features. A key finding of our analysis is that metafeatures-based explanations are better at mimicking the behavior of the black-box prediction model, as measured by the fidelity of the explanations.
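To make the described pipeline concrete, the minimal sketch below shows one way a metafeature-based surrogate could be set up: a black-box classifier is trained on fine-grained sparse features, the features are aggregated into a small number of dense metafeatures, and a shallow decision tree is fitted on the metafeatures to mimic the black-box predictions. The use of NMF for constructing metafeatures, a depth-3 tree as the rule set, and fidelity computed on the training data are illustrative assumptions, not necessarily the authors' exact configuration.

```python
# Illustrative sketch only: metafeature-based rule extraction from a black-box model.
# Assumptions (not from the paper): NMF-derived metafeatures, a depth-3 decision tree
# as the extracted rule set, fidelity measured on the data used to fit the surrogate.
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.linear_model import LogisticRegression
from sklearn.decomposition import NMF
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)

# Synthetic high-dimensional, sparse "behavioral" data (rows = instances, cols = features).
X = sparse_random(1000, 5000, density=0.01, random_state=rng, format="csr")
y = rng.randint(0, 2, size=1000)  # binary target

# Black-box model trained on the fine-grained, sparse features.
black_box = LogisticRegression(max_iter=1000).fit(X, y)
y_bb = black_box.predict(X)  # black-box labels the surrogate should mimic

# Aggregate thousands of sparse features into a handful of dense metafeatures.
meta = NMF(n_components=10, init="nndsvda", random_state=0)
X_meta = meta.fit_transform(X)

# Shallow surrogate tree = a small, global set of explanation rules over metafeatures.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_meta, y_bb)

# Fidelity: agreement between the surrogate's and the black box's predicted labels.
fidelity = accuracy_score(y_bb, surrogate.predict(X_meta))
print(f"fidelity = {fidelity:.3f}")
print(export_text(surrogate, feature_names=[f"metafeature_{i}" for i in range(10)]))
```

The printed tree text is the global explanation: each root-to-leaf path is a rule over metafeatures, and the fidelity score quantifies how well these few rules reproduce the black-box model's behavior.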