Detecting anomalous points in time series without supervision is challenging: the model must learn informative representations and derive a distinguishable criterion. Prior methods mainly detect anomalies from the recurrent-network representation of each time point. However, such point-wise representations are less informative for complex temporal patterns and can be dominated by normal patterns, which makes rare anomalies harder to distinguish. We find that, within a time series, each time point can also be described by its associations with all time points, expressed as a point-wise distribution that is more expressive for temporal modeling. We further observe that, because anomalies are rare, they struggle to build strong associations with the whole series, so their associations concentrate mainly on adjacent time points. This observation implies an inherently distinguishable criterion between normal and abnormal points, which we highlight as the \emph{Association Discrepancy}. Technically, we propose the \emph{Anomaly Transformer} with an \emph{Anomaly-Attention} mechanism to compute the association discrepancy. A minimax strategy is devised to amplify the normal-abnormal distinguishability of the association discrepancy. Anomaly Transformer achieves state-of-the-art performance on six unsupervised time series anomaly detection benchmarks for three applications: service monitoring, space \& earth exploration, and water treatment.
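To make the idea of a per-point association distribution concrete, the following is a minimal NumPy sketch, not the Anomaly-Attention mechanism itself. It rests on assumptions the abstract does not state: the association of each point with all others is taken as a softmax over scaled dot-product similarities, the "adjacent-concentrated" reference is modelled as a Gaussian kernel over temporal distance with a fixed width `sigma`, and the discrepancy is measured with a symmetrised KL divergence. All function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def series_association(x):
    """Row i is a distribution over all time points for time point i.

    Assumption: similarity is a scaled dot product of the feature vectors.
    x has shape (n_points, n_features).
    """
    scores = x @ x.T / np.sqrt(x.shape[1])
    return softmax(scores, axis=-1)


def adjacent_prior(n_points, sigma=1.0):
    """Row i is a distribution concentrated on the temporal neighbours of i.

    Assumption: a Gaussian kernel over the index distance |i - j|,
    normalised so that each row sums to one.
    """
    idx = np.arange(n_points)
    dist = np.abs(idx[:, None] - idx[None, :])
    kernel = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum(axis=-1, keepdims=True)


def association_discrepancy(prior, series, eps=1e-12):
    """Symmetrised KL divergence between matching rows; one score per point."""
    p = prior + eps
    s = series + eps
    kl_ps = np.sum(p * np.log(p / s), axis=-1)
    kl_sp = np.sum(s * np.log(s / p), axis=-1)
    return kl_ps + kl_sp


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(50, 4))  # toy series: 50 time points, 4 features
    disc = association_discrepancy(adjacent_prior(50), series_association(x))
    print(disc.shape)  # one discrepancy value per time point
```

Under these assumptions, a point whose associations concentrate on its neighbours lies close to the Gaussian reference and receives a small discrepancy, whereas a point associated broadly with the whole series receives a large one. The sketch omits any learned parameters and the minimax training strategy.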