Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? Existing approaches aim to detect individually surprising edges. In this work, we propose MIDAS, which focuses on detecting microcluster anomalies, or suddenly arriving groups of suspiciously similar edges, such as lockstep behavior, including denial of service attacks in network traffic data. We further propose MIDAS-F, to solve the problem by which anomalies are incorporated into the algorithm's internal states, creating a `poisoning' effect that can allow future anomalies to slip through undetected. MIDAS-F introduces two modifications: 1) We modify the anomaly scoring function, aiming to reduce the `poisoning' effect of newly arriving edges; 2) We introduce a conditional merge step, which updates the algorithm's data structures after each time tick, but only if the anomaly score is below a threshold value, also to reduce the `poisoning' effect. Experiments show that MIDAS-F has significantly higher accuracy than MIDAS. MIDAS has the following properties: (a) it detects microcluster anomalies while providing theoretical guarantees about its false positive probability; (b) it is online, thus processing each edge in constant time and constant memory, and also processes the data orders-of-magnitude faster than state-of-the-art approaches; (c) it provides up to 62% higher ROC-AUC than state-of-the-art approaches.
翻译:以动态图形的图表边缘为例, 我们如何在网上将异常分数分配到边缘, 以便用恒定的时间和记忆探测异常行为? 现有方法旨在探测个别惊人的边缘。 在这项工作中, 我们提议MIDAS, 重点是探测微集异常, 或突然到达可疑的边缘群体, 如锁定行为, 包括拒绝网络交通数据中的服务攻击。 我们进一步提议 MIDAS- F, 以解决异常值纳入算法内部状态的问题, 从而产生“ 中毒” 效应, 使得未来的异常能通过不被发现的方式滑落。 MIDAS- F 提出了两项修改:(1) 我们修改异常分数函数, 目的是减少新到达边缘的“ 威胁” 效应; (2) 我们引入一个有条件的合并步骤, 在每次勾选后更新算法数据结构, 但只有当异常分低于临界值时, 也降低“ 中毒” 效果。 实验显示MIDAS- F 的精确度大大高于MIDAS 。 MIDAS- 的精确度, 并且 提供了不断的精确性数据处理。 因此, MIDAS- 提供了不断的概率 。