The computation of the distance of two time series is time-consuming for any elastic distance function that accounts for misalignments. Among those functions, DTW is the most prominent. However, a recent extensive evaluation has shown that the move-split merge (MSM) metric is superior to DTW regarding the analytical accuracy of the 1-NN classifier. Unfortunately, the running time of the standard dynamic programming algorithm for MSM distance computation is $\Omega(n^2)$, where $n$ is the length of the longest time series. In this paper, we provide approaches to reducing the cost of MSM distance computations by using lower and upper bounds for early pruning paths in the underlying dynamic programming table. For the case of one time series being a constant, we present a linear-time algorithm. In addition, we propose new linear-time heuristics and adapt heuristics known from DTW to computing the MSM distance. One heuristic employs the metric property of MSM and the previously introduced linear-time algorithm. Our experimental studies demonstrate substantial speed-ups in our approaches compared to previous MSM algorithms. In particular, the running time for MSM is faster than a state-of-the-art DTW distance computation for a majority of the popular UCR data sets.
翻译:摘要:对于任何考虑对齐偏差的弹性距离函数,计算两个时间序列的距离是耗时的。在这些函数中,DTW是最突出的。然而,最近的广泛评估表明,相对于DTW,move-split merge(MSM)度量在1-NN分类器的分析准确性方面优越。不幸的是,用于MSM距离计算的标准动态规划算法的运行时间为$O(n^2)$,其中$n$是最长时间序列的长度。在本文中,我们提供了降低MSM距离计算成本的方法,通过在底层动态规划表中使用下界和上界来提前修剪路径。对于一个时间序列是恒定的情况下,我们提出了一个线性时间算法。此外,我们提出了新的线性时间启发式,并从DTW中借鉴已知启发式来计算MSM距离。一个启发式利用了MSM的度量属性和先前介绍的线性时间算法。我们的实验研究表明,与先前的MSM算法相比,我们的方法大大加快了速度。特别是,在大多数流行的UCR数据集上,MSM的运行时间比最先进的DTW距离计算更快。