Data series motif discovery is one of the most useful primitives for data series mining, with applications in many domains, including robotics, entomology, seismology, medicine, and climatology. State-of-the-art motif discovery tools still require the user to provide the motif length. Yet, in several cases, the choice of motif length is critical for detecting the motifs. Unfortunately, the obvious brute-force solution, which tests all lengths within a given range, is computationally untenable and provides no support for ranking motifs at different resolutions (i.e., lengths). We demonstrate VALMOD, our scalable motif discovery algorithm, which efficiently finds all motifs in a given range of lengths and outputs a length-invariant ranking of motifs. Furthermore, we support the analysis process by means of a newly proposed meta-data structure that helps the user select the most promising pattern length. This demo illustrates in detail the steps of the proposed approach, showcasing how our algorithm and the corresponding graphical insights enable users to efficiently identify the correct motifs. (Paper published at the ACM SIGMOD Conference 2018.)
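To make the brute-force baseline concrete, the following is a minimal Python sketch of testing every length in a range and then ranking the per-length motif pairs on a common scale. It is not VALMOD and does not reflect its pruning or its meta-data structure. The function names (`znorm`, `motif_pair`, `rank_motifs`) are hypothetical, and dividing the z-normalized Euclidean distance by the square root of the length is an assumption made here so that distances at different lengths can be compared.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def motif_pair(series, length):
    """Brute-force search for the closest pair of non-overlapping
    z-normalized subsequences of the given length.

    Returns (distance, i, j), or (inf, None, None) if no pair exists.
    """
    subs = [znorm(series[i:i + length])
            for i in range(len(series) - length + 1)]
    best = (math.inf, None, None)
    for i in range(len(subs)):
        # Start j at i + length so overlapping (trivial) matches are skipped.
        for j in range(i + length, len(subs)):
            d = math.dist(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best


def rank_motifs(series, min_len, max_len):
    """Run the brute-force search for every length in [min_len, max_len]
    and sort the results by length-normalized distance.

    Assumption for illustration: dividing by sqrt(length) makes distances
    at different lengths comparable.
    """
    results = []
    for length in range(min_len, max_len + 1):
        d, i, j = motif_pair(series, length)
        if i is None:
            continue  # series too short for a non-overlapping pair
        results.append((d / math.sqrt(length), length, i, j))
    results.sort()
    return results


if __name__ == "__main__":
    # Toy series with the pattern [0, 1, 3, 1, 0] planted at offsets 0 and 9.
    series = [0, 1, 3, 1, 0, 5, 2, 7, 4, 0, 1, 3, 1, 0, 6, 9, 2]
    for norm_dist, length, i, j in rank_motifs(series, 3, 5):
        print(f"length={length} offsets=({i}, {j}) normalized distance={norm_dist:.3f}")
```

Each call to `motif_pair` compares a quadratic number of subsequence pairs, and `rank_motifs` repeats that work for every candidate length, which is why this approach becomes impractical on long series and wide length ranges.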