Semi-supervised Anomaly Detection (AD) is a data mining task that aims to learn features from partially labeled datasets to help detect outliers. In this paper, we classify existing semi-supervised AD methods into two categories, unsupervised-based and supervised-based, and point out that most of them suffer from insufficient exploitation of labeled data and under-exploration of unlabeled data. To tackle these problems, we propose Deep Anomaly Detection and Search (DADS), which applies Reinforcement Learning (RL) to balance exploitation and exploration. During training, the agent searches for possible anomalies using hierarchically structured datasets and uses the anomalies it finds to enhance performance, an approach that in essence draws on the idea of ensemble learning. Experimentally, we compare DADS with several state-of-the-art methods in two settings: leveraging labeled known anomalies to detect other known anomalies, and leveraging them to detect unknown anomalies. Results show that DADS can efficiently and precisely find anomalies in unlabeled data and learn from them, thus achieving good performance.