DeepFIB: Self-Imputation for Time Series Anomaly Detection

Time series (TS) anomaly detection (AD) plays an essential role in various applications, e.g., fraud detection in finance and healthcare monitoring. Due to the inherently unpredictable and highly varied nature of anomalies and the lack of anomaly labels in historical data, the AD problem is typically formulated as an unsupervised learning problem. The performance of existing solutions is often not satisfactory, especially in data-scarce scenarios. To tackle this problem, we propose a novel self-supervised learning technique for AD in time series, namely DeepFIB. We model the problem as a "Fill In the Blank" game by masking some elements in the TS and imputing them with the rest. Considering the two common anomaly shapes (point- or sequence-outliers) in TS data, we implement two masking strategies with many self-generated training samples. The corresponding self-imputation networks can extract more robust temporal relations than existing AD solutions and effectively facilitate identifying the two types of anomalies. For continuous outliers, we also propose an anomaly localization algorithm that dramatically reduces AD errors. Experiments on various real-world TS datasets demonstrate that DeepFIB outperforms state-of-the-art methods by a large margin, achieving up to 65.2% relative improvement in F1-score.
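The fill-in-the-blank idea can be illustrated with a minimal sketch. This is not the authors' implementation: the learned self-imputation network is replaced here by plain linear interpolation, and the helper names (`impute_masked`, `point_masks`, `block_masks`, `imputation_scores`), the stride, and the block width are all assumptions made for illustration. The sketch masks elements, imputes them from the unmasked rest, and uses the absolute imputation error as an anomaly score.

```python
import math


def impute_masked(series, masked):
    """Fill masked positions by linear interpolation between the nearest
    unmasked neighbours (a stand-in for a learned imputation network)."""
    n = len(series)
    filled = list(series)
    for i in sorted(masked):
        left = i - 1
        while left >= 0 and left in masked:
            left -= 1
        right = i + 1
        while right < n and right in masked:
            right += 1
        if left < 0 and right >= n:
            raise ValueError("cannot impute: every element is masked")
        if left < 0:
            filled[i] = series[right]   # no unmasked value to the left
        elif right >= n:
            filled[i] = series[left]    # no unmasked value to the right
        else:
            w = (i - left) / (right - left)
            filled[i] = (1 - w) * series[left] + w * series[right]
    return filled


def point_masks(n, stride):
    """Point-style masking (assumed scheme): each mask hides every
    `stride`-th element, and the masks together cover each index once."""
    return [set(range(offset, n, stride)) for offset in range(stride)]


def block_masks(n, width):
    """Sequence-style masking (assumed scheme): each mask hides one
    contiguous block of up to `width` elements."""
    return [set(range(s, min(s + width, n))) for s in range(0, n, width)]


def imputation_scores(series, masks):
    """Anomaly score per index: |observed - imputed| when it was masked."""
    scores = [0.0] * len(series)
    for masked in masks:
        filled = impute_masked(series, masked)
        for i in masked:
            scores[i] = abs(series[i] - filled[i])
    return scores


# Synthetic sine wave with one injected point anomaly at index 120.
series = [math.sin(2 * math.pi * t / 50) for t in range(200)]
series[120] += 3.0

scores = imputation_scores(series, point_masks(len(series), stride=4))
top = max(range(len(scores)), key=scores.__getitem__)
print(top, round(scores[top], 2))
```

On this synthetic series the largest score falls on the injected outlier. Passing `block_masks(len(series), width)` instead hides whole segments at once, which is the kind of masking the abstract associates with sequence outliers; the anomaly localization algorithm mentioned there is not reproduced in this sketch.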


Related content

Anomaly detection

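In data mining, anomaly detection is the identification of items, events, or observations that do not conform to an expected pattern or to the other items in a dataset. Anomalous items typically correspond to some kind of problem, such as bank fraud, a structural defect, a medical problem, or an error in a text. Anomalies are also called outliers, novelties, noise, deviations, and exceptions.

In misuse and network intrusion detection in particular, the interesting objects are often not rare objects but unexpected bursts of activity. Such patterns do not fit the usual statistical definition of an outlier as a rare object, so many anomaly detection methods (unsupervised ones especially) fail on this kind of data unless it has been aggregated appropriately. A cluster analysis algorithm, by contrast, may be able to detect the micro-clusters these patterns form.

There are three broad categories of anomaly detection methods. [1] Unsupervised anomaly detection methods assume that most instances in the dataset are normal and detect anomalies in unlabeled test data by looking for the instances that fit the rest of the data least well. Supervised anomaly detection methods require a dataset labeled "normal" and "abnormal" and involve training a classifier (the key difference from many other statistical classification problems is the inherent class imbalance of anomaly detection). Semi-supervised anomaly detection methods build a model of normal behavior from a given normal training dataset and then test how likely it is that a test instance was generated by the learned model.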
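As a small illustration of the unsupervised category, the sketch below flags the values that fit the rest of a sample least well, using a median/MAD-based robust z-score. The technique, the function names (`robust_z_scores`, `flag_outliers`), the 0.6745 scaling constant, and the 3.5 threshold are choices made for this example and do not come from the text above; the sample readings are invented.

```python
import statistics


def robust_z_scores(values):
    """Robust z-score of each value, based on the median and the median
    absolute deviation (MAD) instead of the mean and standard deviation."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # Simplification for this sketch: no spread, so nothing is scored.
        return [0.0] * len(values)
    # 0.6745 rescales the MAD so that scores are comparable to ordinary
    # z-scores when the data are roughly normally distributed.
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, threshold=3.5):
    """Indices whose robust z-score exceeds `threshold` in magnitude."""
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Invented sample: mostly values near 10, with one clear outlier.
readings = [10.1, 9.8, 10.0, 10.3, 9.9, 25.0, 10.2, 9.7]
outliers = flag_outliers(readings)
print(outliers)  # prints [5]
```

No labels are used anywhere: the method relies only on the assumption that most readings are normal, which is exactly the premise stated for unsupervised methods.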
