Detecting anomalous time series is key for scientific, medical and industrial tasks, but is challenging because the task is inherently unsupervised. In recent years, progress has been made on this task by learning increasingly complex features, often using deep neural networks. In this work, we argue that shallow features suffice when combined with distribution distance measures. Our approach models each time series as a high-dimensional empirical distribution of features, where each time point constitutes a single sample. Modeling the distance between a test time series and the normal training set therefore requires efficiently measuring the distance between multivariate probability distributions. We show that by parameterizing each time series using cumulative Radon features, we are able to efficiently and effectively model the distribution of normal time series. Our theoretically grounded but simple-to-implement approach is evaluated on multiple datasets and shown to achieve better results than established classical methods as well as complex, state-of-the-art deep learning methods. Code is provided.
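The idea of treating each time series as a set of samples and comparing distributions through projections can be illustrated with a minimal sketch. It is not the released code: the function names, the use of raw per-time-point values as the "shallow features", the random unit directions, the fixed thresholds, and the nearest-neighbour scoring rule are all assumptions made here for illustration. Each series is projected onto a set of directions (a discrete Radon-style projection), and the empirical cumulative distribution of each projection is evaluated at fixed thresholds to give a fixed-length descriptor.

```python
import numpy as np

def cumulative_radon_features(series, directions, thresholds):
    """Describe one time series by the empirical CDFs of its 1-D projections.

    series:     (T, D) array, one row per time point (treated as one sample).
    directions: (P, D) array of unit vectors (hypothetical choice: random).
    thresholds: (B,) increasing values at which each CDF is evaluated.
    Returns a flat vector of length P * B with entries in [0, 1].
    """
    proj = series @ directions.T                        # (T, P) projections
    below = proj[:, :, None] <= thresholds[None, None]  # (T, P, B) indicators
    return below.mean(axis=0).ravel()                   # empirical CDF values

def fit_normal_features(train_series, directions, thresholds):
    """Stack descriptors of all normal training series into an (N, P*B) array."""
    return np.stack([
        cumulative_radon_features(s, directions, thresholds)
        for s in train_series
    ])

def anomaly_score(series, train_feats, directions, thresholds):
    """Assumed scoring rule: Euclidean distance to the nearest training descriptor."""
    f = cumulative_radon_features(series, directions, thresholds)
    return float(np.linalg.norm(train_feats - f, axis=1).min())

# Toy usage on synthetic data (illustrative only, not a benchmark).
rng = np.random.default_rng(0)
D, P = 3, 16
directions = rng.normal(size=(P, D))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
thresholds = np.linspace(-4.0, 4.0, 21)

train = [rng.normal(size=(200, D)) for _ in range(20)]
train_feats = fit_normal_features(train, directions, thresholds)

normal_test = rng.normal(size=(200, D))
shifted_test = rng.normal(size=(200, D)) + 3.0          # mean-shifted series
print(anomaly_score(normal_test, train_feats, directions, thresholds))
print(anomaly_score(shifted_test, train_feats, directions, thresholds))
```

Because every descriptor has the same length regardless of the series length T, comparing a test series with the training set reduces to ordinary vector distances. Other scoring rules, such as fitting a Gaussian to the training descriptors, would fit the same interface.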