The detection of anomalous behaviours is an emerging need in many applications, particularly in contexts where security and reliability are critical. While the definition of an anomaly depends strictly on the application domain, obtaining a fully labelled dataset is often impractical or too time-consuming. Unsupervised models, used to overcome the lack of labels, often fail to catch domain-specific anomalies because they rely on general definitions of an outlier. This paper proposes a new active-learning-based approach, ALIF, that addresses this problem by reducing the number of required labels and tuning the detector towards the definition of anomaly provided by the user. The approach is particularly appealing in the presence of a Decision Support System (DSS), a case that is increasingly common in real-world scenarios. DSSs with embedded anomaly detection capabilities commonly rely on unsupervised models, but they have no way to improve their performance; ALIF enhances such DSSs by exploiting the user feedback collected during normal operation. ALIF is a lightweight modification of the popular Isolation Forest that showed superior performance with respect to other state-of-the-art algorithms on a multitude of real anomaly detection datasets.