Matrix Profile XXVII: A Novel Distance Measure for Comparing Long Time Series

The most useful data mining primitives are distance measures. With an effective distance measure, it is possible to perform classification, clustering, anomaly detection, segmentation, etc. For single-event time series, Euclidean Distance and Dynamic Time Warping distance are known to be extremely effective. However, for time series containing cyclical behaviors, the semantic meaningfulness of such comparisons is less clear. For example, on two separate days the telemetry from an athlete's workout routine might be very similar. On the second day, the athlete may change the order of performing push-ups and squats, add repetitions of pull-ups, or completely omit dumbbell curls. Any of these minor changes would defeat existing time series distance measures. Some bag-of-features methods have been proposed to address this problem, but we argue that in many cases, similarity is intimately tied to the shapes of subsequences within these longer time series. In such cases, summative features will lack discrimination ability. In this work we introduce PRCIS, which stands for Pattern Representation Comparison in Series. PRCIS is a distance measure for long time series that exploits recent progress in our ability to summarize time series with dictionaries. We will demonstrate the utility of our ideas on diverse tasks and datasets.
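The claim that reordering defeats whole-series distances can be illustrated with a small toy sketch. This is not PRCIS, whose definition the abstract does not give; the two shapes (`bump`, `saw`), the series `day1` and `day2`, and the helper `nearest_subsequence_distance` are all hypothetical names made up for illustration. The sketch only shows that a point-by-point Euclidean distance grows when two patterns swap places, while a comparison based on the best-matching subsequence does not.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_subsequence_distance(pattern, series):
    """Smallest Euclidean distance between `pattern` and any
    same-length window of `series` (hypothetical helper)."""
    m = len(pattern)
    return min(
        euclidean(pattern, series[i:i + m])
        for i in range(len(series) - m + 1)
    )

# Two made-up "exercise" shapes, 20 samples each (assumed for illustration).
bump = [math.sin(math.pi * i / 19) for i in range(20)]  # one smooth hump
saw = [(i % 5) / 4 for i in range(20)]                  # repeating ramp

day1 = bump + saw  # e.g. push-ups, then squats
day2 = saw + bump  # same two exercises, order swapped

# Whole-series comparison: aligned point by point, so the swap is penalized.
whole = euclidean(day1, day2)

# Pattern-based comparison: worst-case distance from each known pattern
# to its best match anywhere in day2.
pattern_based = max(nearest_subsequence_distance(p, day2) for p in (bump, saw))

print(f"whole-series distance: {whole:.3f}")
print(f"pattern-based distance: {pattern_based:.3f}")
```

Here the patterns are supplied by hand; the abstract suggests that PRCIS instead builds on dictionaries that summarize each series, which this sketch does not attempt.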


