In this paper, we propose a dimensionless anomaly detection method for multivariate streams. Our method is independent of the unit of measurement for the different stream channels, therefore dimensionless. We first propose the variance norm, a generalisation of Mahalanobis distance to handle infinite-dimensional feature space and singular empirical covariance matrix rigorously. We then combine the variance norm with the path signature, an infinite collection of iterated integrals that provide global features of streams, to propose SigMahaKNN, a method for anomaly detection on (multivariate) streams. We show that SigMahaKNN is invariant to stream reparametrisation, stream concatenation and has a graded discrimination power depending on the truncation level of the path signature. We implement SigMahaKNN as an open-source software, and perform extensive numerical experiments, showing significantly improved anomaly detection on streams compared to isolation forest and local outlier factors in applications ranging from language analysis, hand-writing analysis, ship movement paths analysis and univariate time-series analysis.
翻译:暂无翻译