在时间序列中检测异常现象的在线虚假发现率控制 (Online false discovery rate control for anomaly detection in time series)

This article proposes novel rules for false discovery rate control (FDRC) geared towards online anomaly detection in time series. Online FDRC rules allow to control the properties of a sequence of statistical tests. In the context of anomaly detection, the null hypothesis is that an observation is normal and the alternative is that it is anomalous. FDRC rules allow users to target a lower bound on precision in unsupervised settings. The methods proposed in this article overcome short-comings of previous FDRC rules in the context of anomaly detection, in particular ensuring that power remains high even when the alternative is exceedingly rare (typical in anomaly detection) and the test statistics are serially dependent (typical in time series). We show the soundness of these rules in both theory and experiments.

翻译：本条提出了针对时间序列中在线异常点检测的虚假发现率控制(FDRC)新新规则。在线FDRC规则允许控制统计测试序列的特性。在异常点检测方面,无效假设是观察是正常的,替代是不正常的。FDRC规则允许用户在未受监督的环境中以较低程度的精确度为目标。本条建议的方法克服了FDRC规则在异常点检测方面的缺陷,特别是确保即使在替代方法极为罕见(异常点检测通常)和测试统计数据具有序列依赖性(时间序列通常)的情况下,权力仍然很高。我们在理论和实验中都表明了这些规则的正确性。

相关内容

异常检测

关注 102

在数据挖掘中，异常检测（英语：anomaly detection）对不符合预期模式或数据集中其他项目的项目、事件或观测值的识别。通常异常项目会转变成银行欺诈、结构缺陷、医疗问题、文本错误等类型的问题。异常也被称为离群值、新奇、噪声、偏差和例外。特别是在检测滥用与网络入侵时，有趣性对象往往不是罕见对象，但却是超出预料的突发活动。这种模式不遵循通常统计定义中把异常点看作是罕见对象，于是许多异常检测方法（特别是无监督的方法）将对此类数据失效，除非进行了合适的聚集。相反，聚类分析算法可能可以检测出这些模式形成的微聚类。有三大类异常检测方法。[1] 在假设数据集中大多数实例都是正常的前提下，无监督异常检测方法能通过寻找与其他数据最不匹配的实例来检测出未标记测试数据的异常。监督式异常检测方法需要一个已经被标记“正常”与“异常”的数据集，并涉及到训练分类器（与许多其他的统计分类问题的关键区别是异常检测的内在不均衡性）。半监督式异常检测方法根据一个给定的正常训练数据集创建一个表示正常行为的模型，然后检测由学习模型生成的测试实例的可能性。

生成式对抗网络异常检测，GANs for Anomaly Detection

专知会员服务

34+阅读 · 2021年9月16日

熊辉等首篇「深度学习图异常检测」综述论文，176篇文献全面概述GAD技术

专知会员服务

83+阅读 · 2021年6月23日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

专知会员服务

22+阅读 · 2020年4月16日