Modern time series analysis requires the ability to handle datasets that are inherently high-dimensional; examples include applications in climatology, where measurements from numerous sensors must be taken into account, or inventory tracking of large shops, where the dimension is defined by the number of tracked items. The standard way to mitigate computational issues arising from the high dimensionality of the data is by applying some dimension reduction technique that preserves the structural properties of the ambient space. The dissimilarity between two time series is often measured by ``discrete'' notions of distance, e.g. the dynamic time warping or the discrete Fr\'echet distance. Since all these distance functions are computed directly on the points of a time series, they are sensitive to different sampling rates or gaps. The continuous Fr\'echet distance offers a popular alternative which aims to alleviate this by taking into account all points on the polygonal curve obtained by linearly interpolating between any two consecutive points in a sequence. We study the ability of random projections \`a la Johnson and Lindenstrauss to preserve the continuous Fr\'echet distance of polygonal curves by effectively reducing the dimension. In particular, we show that one can reduce the dimension to $O(\epsilon^{-2} \log N)$, where $N$ is the total number of input points while preserving the continuous Fr\'echet distance between any two determined polygonal curves within a factor of $1\pm \epsilon$. We conclude with applications on clustering.
翻译:现代时间序列分析要求有能力处理本质上是高维的数据集; 示例包括气候学应用,其中必须考虑到来自众多传感器的测量结果,或大型商店的库存跟踪,其尺寸由跟踪的物品数量决定。 减轻数据高度维度产生的计算问题的标准方法是应用某种维度减少技术,以维护环境空间的结构特性。 两个时间序列之间的差异往往用“分解”的距离概念来衡量,例如动态时间扭曲或离散的Fr\'echet距离。由于所有这些距离功能都是在时间序列点上直接计算,因此对不同的采样率或差距十分敏感。 持续的Fr\'echet距离提供了一种流行的替代方法,目的是通过在序列中任何两个连续点之间的线性插点进行缓解。 我们研究随机预测( ⁇ a la Johnson 和 Lindenstraus 以保持连续的Fréchel $ 的距离的能力, 而我们可以有效地缩小一个持续硬度的硬度范围, 从而缩小一个持续的硬度值值值 。