Subsequence-based time series classification algorithms provide accurate and interpretable models, but training these models is computationally very expensive. The asymptotic time complexity of subsequence-based algorithms remains a higher-order polynomial because these algorithms rely on an exhaustive search for highly discriminative subsequences. Pattern sampling has been proposed as an effective alternative that mitigates the pattern explosion phenomenon. We therefore employ pattern sampling to extract discriminative features from discretized time series data. A weighted trie is built from the discretized time series and used to sample highly discriminative patterns. The sampled patterns are used to identify shapelets, which in turn transform the time series classification problem into a feature-based classification problem. Finally, a classification model can be trained using any off-the-shelf algorithm. Creating a pattern sampler requires evaluating only a small number of patterns compared with the exhaustive search employed by previous approaches. As a result, our approach requires considerably fewer computational and memory resources than previously proposed algorithms. Experiments demonstrate how the proposed approach fares in terms of classification accuracy and runtime performance.
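The pipeline described above (discretize, build a weighted trie, sample patterns) can be illustrated with a minimal sketch. The abstract does not specify the discretization scheme, the weighting function, or the sampling procedure, so everything here is an assumption chosen for illustration: a SAX-style 4-symbol alphabet without a PAA step, a weight equal to the spread of per-class relative support, and a root-to-node random walk that returns each pattern with probability proportional to its weight. The names `discretize`, `build_trie`, and `sample_pattern` are hypothetical.

```python
import random
from bisect import bisect_right
from statistics import mean, pstdev

BREAKPOINTS = (-0.67, 0.0, 0.67)  # assumed: SAX breakpoints for a 4-symbol alphabet
ALPHABET = "abcd"


def discretize(series):
    """Z-normalize a series and map every point to a symbol (no PAA step)."""
    mu, sigma = mean(series), pstdev(series) or 1.0
    return "".join(ALPHABET[bisect_right(BREAKPOINTS, (x - mu) / sigma)] for x in series)


class Node:
    def __init__(self):
        self.children = {}
        self.support = {}  # class label -> number of series containing the pattern
        self.weight = 0.0  # assumed discriminative score of the pattern ending here
        self.total = 0.0   # weight of this node plus all of its descendants


def build_trie(words, labels, max_len):
    """Insert every substring of length <= max_len, then weight each node."""
    root = Node()
    for word, label in zip(words, labels):
        seen = set()
        for i in range(len(word)):
            node = root
            for j in range(i, min(i + max_len, len(word))):
                node = node.children.setdefault(word[j], Node())
                if id(node) not in seen:  # count each series once per pattern
                    seen.add(id(node))
                    node.support[label] = node.support.get(label, 0) + 1
    class_sizes = {c: labels.count(c) for c in set(labels)}
    _assign_weights(root, class_sizes, is_root=True)
    return root


def _assign_weights(node, class_sizes, is_root=False):
    if not is_root:
        # Assumed score: spread of the pattern's relative support across classes.
        freqs = [node.support.get(c, 0) / n for c, n in class_sizes.items()]
        node.weight = max(freqs) - min(freqs)
    node.total = node.weight + sum(
        _assign_weights(child, class_sizes) for child in node.children.values()
    )
    return node.total


def sample_pattern(root, rng):
    """Walk down the trie; each pattern is returned with probability weight / root.total."""
    if root.total <= 0:
        raise ValueError("no pattern has a positive weight")
    node, pattern = root, ""
    while True:
        # Either stop at the current node or descend into one child subtree.
        options = [(None, node.weight)] + [(s, c.total) for s, c in node.children.items()]
        symbol = rng.choices([s for s, _ in options], weights=[w for _, w in options])[0]
        if symbol is None:
            return pattern
        pattern += symbol
        node = node.children[symbol]
```

For example, `discretize([0, 0, 10, 0, 0])` yields `"bbdbb"`, and repeated calls to `sample_pattern(root, random.Random(0))` draw patterns without enumerating and ranking every candidate. Mapping the sampled symbolic patterns back to real-valued shapelets and computing the shapelet transform are outside the scope of this sketch.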