Anomaly detection is concerned with identifying examples in a dataset that do not conform to the expected behaviour. While a vast number of anomaly detection algorithms exist, little attention has been paid to explaining why these algorithms flag certain examples as anomalies. Yet such an explanation could be extremely useful to anyone interpreting the algorithms' output. This paper develops a method to explain the anomaly predictions of the state-of-the-art Isolation Forest anomaly detection algorithm. The method outputs an explanation vector that captures how important each attribute of an example is in identifying it as anomalous. A thorough experimental evaluation on both synthetic and real-world datasets shows that our method is more accurate and more efficient than most contemporary state-of-the-art explainability methods.
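To illustrate the kind of output described above, the sketch below produces a per-attribute explanation vector for an example flagged by scikit-learn's IsolationForest. It uses a simple occlusion baseline (replace one attribute at a time with the training median and measure the drop in anomalousness); this is only an assumed, illustrative attribution scheme, not the method proposed in the paper.

```python
# Illustrative sketch: a per-attribute explanation vector for an Isolation
# Forest prediction, computed with a simple occlusion baseline.
# NOTE: this is an assumed baseline for illustration, not the paper's method.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 4))        # nominal training data
x_anom = np.array([0.0, 0.0, 6.0, 0.0])    # example that is anomalous in attribute 2

forest = IsolationForest(random_state=0).fit(X_train)

def explanation_vector(forest, X_train, x):
    """Importance of each attribute = drop in anomalousness when that
    attribute is replaced by its training median (occlusion baseline)."""
    # score_samples returns higher values for normal points, so negate it
    # to obtain an anomalousness score (higher = more anomalous).
    base = -forest.score_samples(x.reshape(1, -1))[0]
    medians = np.median(X_train, axis=0)
    importances = np.empty(x.shape[0])
    for j in range(x.shape[0]):
        x_occluded = x.copy()
        x_occluded[j] = medians[j]
        occluded = -forest.score_samples(x_occluded.reshape(1, -1))[0]
        importances[j] = base - occluded   # contribution of attribute j
    return importances

print(explanation_vector(forest, X_train, x_anom))  # attribute 2 should dominate
```

In this toy setup the third attribute receives by far the largest importance, which is the shape of explanation (one score per attribute) that the paper's explanation vectors provide.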