We give the first approximate-near-neighbor data structures for time series under the continuous Fr\'echet distance. For an attainable approximation factor $c>1$, the data structure can be used to preprocess $n$ curves in $\mathbb{R}$ (that is, time series), each of complexity $m$, to answer queries with a curve of complexity $k$ by either returning an input curve that lies within Fr\'echet distance $c$ of the query, or reporting that no input curve lies within distance $1$ of the query. In both cases, the answer is correct. Our first data structure achieves a $(5+\epsilon)$ approximation factor, uses space in $n\cdot \mathcal{O}\left({\epsilon^{-1}}\right)^{k} + \mathcal{O}(nm)$ and has query time in $\mathcal{O}\left(k\right)$. Our second data structure achieves a $(2+\epsilon)$ approximation factor, uses space in $n\cdot \mathcal{O}\left(\frac{m}{k\epsilon}\right)^{k} + \mathcal{O}(nm)$ and has query time in $\mathcal{O}\left(k\cdot 2^k\right)$. As an indicator of hardness, we show that any data structure that achieves an approximation factor less than $2$, supports curves of arclength at most $L$, uses a word size bounded by $\mathcal{O}(L^{1-\epsilon})$ for some constant $\epsilon>0$, and answers queries using only a constant number of probes must use at least $L^{\Omega(k)}$ words of storage. Our third positive result is a probabilistic data structure based on locality-sensitive hashing, which achieves space in $\mathcal{O}(nm)$ and query time in $\mathcal{O}(k)$, and which answers queries with an approximation factor in $\mathcal{O}(k)$. All of our data structures make use of the concept of signatures, which were originally introduced for the problem of clustering time series under the Fr\'echet distance.
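For reference, the query guarantee described above can be stated with the standard definition of the continuous Fr\'echet distance. The symbols $P$, $Q$, $f$, $g$, $d_F$ and $\mathcal{P}$ are notation assumed here for illustration and do not appear in the abstract. For two curves $P, Q : [0,1] \to \mathbb{R}$,

$$d_F(P,Q) \;=\; \inf_{f,g}\; \max_{t\in[0,1]} \left| P(f(t)) - Q(g(t)) \right|,$$

where $f$ and $g$ range over all continuous, non-decreasing surjections of $[0,1]$ onto itself. Writing $\mathcal{P}$ for the set of $n$ input curves, a query with a curve $Q$ of complexity $k$ then behaves as follows:

$$\text{output} \;=\; \begin{cases} \text{some } P' \in \mathcal{P} \text{ with } d_F(P',Q) \le c, & \text{which is required whenever some } P \in \mathcal{P} \text{ has } d_F(P,Q) \le 1,\\ \text{``no''}, & \text{which is allowed only if every } P \in \mathcal{P} \text{ has } d_F(P,Q) > 1. \end{cases}$$

When the nearest input curve lies at a distance between $1$ and $c$, either output is consistent with this guarantee.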