Decision trees are well-known due to their ease of interpretability. To improve accuracy, we need to grow deep trees or ensembles of trees. These are hard to interpret, offsetting their original benefits. Shapley values have recently become a popular way to explain the predictions of tree-based machine learning models. It provides a linear weighting to features independent of the tree structure. The rise in popularity is mainly due to TreeShap, which solves a general exponential complexity problem in polynomial time. Following extensive adoption in the industry, more efficient algorithms are required. This paper presents a more efficient and straightforward algorithm: Linear TreeShap. Like TreeShap, Linear TreeShap is exact and requires the same amount of memory.
翻译:决策性树是众所周知的, 因为它们容易解释。 为了提高准确性, 我们需要种植深树或树群。 这些树群很难解释, 很难抵消它们最初的好处 。 沙皮值最近成为解释基于树的机器学习模型预测的流行方式 。 它为独立于树结构的特征提供了线性加权。 流行程度的上升主要是由于TreaShap, 它解决了多民族时期的普遍指数复杂性问题 。 在工业广泛采用后, 需要更有效的算法。 本文展示了一种更有效和直截了当的算法: 线性树沙普 。 与树沙普、 线性树丛沙普 一样, 需要相同的记忆 。