The problem of Approximate Nearest Neighbor (ANN) search is fundamental in computer science and has benefited from significant progress in the past couple of decades. However, most work has been devoted to pointsets whereas complex shapes have not been sufficiently treated. Here, we focus on distance functions between discretized curves in Euclidean space: they appear in a wide range of applications, from road segments to time-series in general dimension. For $\ell_p$-products of Euclidean metrics, for any $p$, we design simple and efficient data structures for ANN, based on randomized projections, which are of independent interest. They serve to solve proximity problems under a notion of distance between discretized curves, which generalizes both discrete Fr\'echet and Dynamic Time Warping distances. These are the most popular and practical approaches to comparing such curves. We offer the first data structures and query algorithms for ANN with arbitrarily good approximation factor, at the expense of increasing space usage and preprocessing time over existing methods. Query time complexity is comparable or significantly improved by our algorithms, our algorithm is especially efficient when the length of the curves is bounded.
翻译:近邻近距离搜索(ANN)问题在计算机科学中具有根本意义,并受益于过去几十年的显著进展。然而,大部分工作都用于点点子,而复杂的形状没有得到充分处理。在这里,我们集中关注在Euclidean空间的离散曲线之间的距离功能:它们出现在从路段到一般时间序列的广泛应用中。对于Euclidean 度量的产物,对于任何美元,我们根据随机化的预测为ANN设计简单而有效的数据结构,这是独立感兴趣的。它们有助于在离散曲线之间的距离概念下解决近距离问题,该概念将离散的Fr\'echet和动态时间旋转距离都加以概括。这是比较这些曲线最受欢迎和最实用的方法。我们为ANN提供了第一种数据结构和查询算法,其任意的近似系数,而牺牲了空间使用的增加和现有方法的预处理时间。根据我们的算法,时间复杂性是可比较或大大改进的,当我们算法的曲线长度是特别有效的。