Many time series data mining problems can be solved with repeated use of distance measure. Examples of such tasks include similarity search, clustering, classification, anomaly detection and segmentation. For over two decades it has been known that the Dynamic Time Warping (DTW) distance measure is the best measure to use for most tasks, in most domains. Because the classic DTW algorithm has quadratic time complexity, many ideas have been introduced to reduce its amortized time, or to quickly approximate it. One of the most cited approximate approaches is FastDTW. The FastDTW algorithm has well over a thousand citations and has been explicitly used in several hundred research efforts. In this work, we make a surprising claim. In any realistic data mining application, the approximate FastDTW is much slower than the exact DTW. This fact clearly has implications for the community that uses this algorithm: allowing it to address much larger datasets, get exact results, and do so in less time.
翻译:许多时间序列数据挖掘问题可以通过反复使用距离测量来解决,例如,类似搜索、集群、分类、异常探测和分割等任务。20多年来,已知动态时间扭曲(DTW)距离测量是大多数领域大多数任务的最佳衡量标准。由于传统的DTW算法具有四重时间复杂性,因此引入了许多想法来减少其摊销时间或快速接近时间。最引人注意的近似方法之一是快速DTW。快速DTW算法有一千多个引用,在数百项研究工作中被明确使用。在这项工作中,我们提出一个令人惊讶的主张。在任何现实的数据挖掘应用中,快速DTW的近似速度比完全的DTW要慢得多。这一事实显然对使用这种算法的社区有影响:允许它处理大得多的数据集,获得准确的结果,而且时间更少。