How Far Should We Look Back to Achieve Effective Real-Time Time-Series Anomaly Detection?

From arXiv. 12 pages, 5 figures, and 9 tables. Proceedings of the 35th International Conference on Advanced Information Networking and Applications (AINA 2021).

Anomaly detection is the process of identifying unexpected events or abnormalities in data, and it has been applied in many areas, such as system monitoring, fraud detection, healthcare, and intrusion detection. Real-time, lightweight, and proactive anomaly detection for time series that requires neither human intervention nor domain knowledge could be highly valuable, since it reduces human effort and enables appropriate countermeasures to be taken before a disastrous event occurs. To our knowledge, RePAD (Real-time Proactive Anomaly Detection algorithm) is a generic approach with all of the above-mentioned features. To achieve real-time and lightweight detection, RePAD uses Long Short-Term Memory (LSTM) to decide whether each upcoming data point is anomalous based on short-term historical data points. However, it is unclear how different amounts of historical data points affect the performance of RePAD. Therefore, in this paper, we investigate the impact of different amounts of historical data on RePAD by introducing a set of performance metrics that cover novel detection accuracy measures, time efficiency, readiness, and resource consumption, among others. Empirical experiments based on real-world time series datasets are conducted to evaluate RePAD in different scenarios, and the experimental results are presented and discussed.
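To make the role of the look-back amount concrete, here is a minimal sketch of a detector that judges each new point using only the last `b` points. It is not RePAD itself: the abstract does not specify the prediction or thresholding details, so the LSTM is replaced by a simple median predictor, and the relative-error measure, the mean-plus-`k`-standard-deviations threshold, and the names `detect`, `b`, and `k` are all assumptions made for illustration.

```python
from statistics import mean, median, stdev


def detect(series, b=3, k=3.0):
    """Return indices of points flagged as anomalous.

    b: look-back length, i.e. how many historical points feed each prediction.
    k: threshold width in standard deviations (an illustrative assumption).
    The median of the look-back window stands in for an LSTM forecast.
    """
    errors = []   # relative prediction errors of points accepted as normal
    flagged = []  # indices reported as anomalous
    for t in range(b, len(series)):
        window = series[t - b:t]          # only short-term history is used
        predicted = median(window)
        actual = series[t]
        err = abs(actual - predicted) / max(abs(actual), 1e-9)
        # A spread estimate needs at least two past errors.
        if len(errors) >= 2:
            threshold = mean(errors) + k * stdev(errors)
            if err > threshold:
                flagged.append(t)
                continue  # keep anomalous errors out of the threshold
        errors.append(err)
    return flagged


series = [10, 10.1, 9.9, 10, 10.2, 9.8, 10, 10.1, 50, 10, 9.9]
print(detect(series, b=3))
```

Changing `b` changes both how soon the detector can start judging points (it needs `b` points first) and how much work each prediction takes, which is the kind of trade-off the paper sets out to measure for RePAD.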


Related content

Anomaly detection


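In data mining, anomaly detection is the identification of items, events, or observations that do not conform to an expected pattern or to the other items in a dataset. Anomalous items typically translate into problems such as bank fraud, structural defects, medical problems, or errors in text. Anomalies are also called outliers, novelties, noise, deviations, and exceptions. In particular, when detecting abuse and network intrusion, the interesting objects are often not rare objects but unexpected bursts of activity. Such a pattern does not fit the usual statistical definition of an outlier as a rare object, so many anomaly detection methods (unsupervised ones in particular) fail on this kind of data unless it has been aggregated appropriately. A cluster analysis algorithm, by contrast, may be able to detect the micro-clusters that these patterns form. There are three broad categories of anomaly detection methods. [1] Unsupervised methods assume that most instances in the dataset are normal and detect anomalies in unlabeled test data by looking for the instances that fit least well with the rest. Supervised methods require a dataset whose instances are labeled "normal" or "abnormal" and involve training a classifier; the key difference from many other statistical classification problems is the inherent class imbalance of anomaly detection. Semi-supervised methods build a model of normal behavior from a given normal training dataset and then test how likely it is that a test instance was generated by the learned model.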
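The semi-supervised category can be sketched in a few lines. This is only one possible instance, assuming that normal behavior is modeled by a single Gaussian and that a fixed log-likelihood cutoff separates normal from anomalous; the function names and the cutoff value are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_normal_model(normal_values):
    """Learn a Gaussian model (mean, standard deviation) from normal-only data."""
    return mean(normal_values), stdev(normal_values)


def log_likelihood(x, model):
    """Log-density of x under the learned Gaussian model."""
    mu, sigma = model
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def is_anomalous(x, model, min_log_likelihood=-10.0):
    """Flag x when the model finds it too unlikely (cutoff is an assumption)."""
    return log_likelihood(x, model) < min_log_likelihood


model = fit_normal_model([9.8, 10.0, 10.1, 9.9, 10.2])
print(is_anomalous(10.0, model), is_anomalous(12.0, model))
```

Only normal data is needed for training, which is what distinguishes this setup from the supervised case, where both classes must be labeled.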
