Trajectory similarity computation is a fundamental component of many real-world applications, such as ridesharing, road planning, and transportation optimization. Advances in mobile devices have increased the volume of available trajectory data to the point where a single machine can no longer support efficient query processing. Distributed in-memory trajectory similarity search is therefore called for. However, existing distributed proposals either waste computing resources or cannot support the range of similarity measures in use. We propose REPOSE, a distributed in-memory management framework for processing top-k trajectory similarity queries on Spark. We develop a reference point trie (RP-Trie) index to organize trajectory data for local search. In addition, we design a novel heterogeneous global partitioning strategy to eliminate load imbalance in distributed settings. We report on extensive experiments with real-world data that offer insight into the performance of the solution and show that it is capable of outperforming state-of-the-art proposals.
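To make the query type concrete, the following is a minimal single-machine sketch of a top-k trajectory similarity query. It is a brute-force baseline for illustration only: it does not implement the RP-Trie index, the partitioning strategy, or any Spark integration. The Hausdorff distance is an assumed example measure, and the names `hausdorff` and `top_k_similar` are hypothetical.

```python
import heapq
import math


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two sequences of 2-D points.

    Assumed example measure; any function taking two trajectories and
    returning a non-negative number could be plugged in instead.
    """
    def directed(p, q):
        # For each point in p, find its nearest point in q; keep the worst case.
        return max(min(math.dist(x, y) for y in q) for x in p)

    return max(directed(a, b), directed(b, a))


def top_k_similar(query, trajectories, k, distance=hausdorff):
    """Return the k (distance, trajectory_id) pairs closest to the query.

    `trajectories` maps an id to a list of (x, y) points. Every trajectory
    is scanned, so the cost is linear in the size of the data set.
    """
    scored = ((distance(query, traj), tid) for tid, traj in trajectories.items())
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    data = {
        "a": [(0, 0), (1, 0)],
        "b": [(0, 1), (1, 1)],
        "c": [(0, 5), (1, 5)],
    }
    print(top_k_similar([(0, 0), (1, 0)], data, k=2))
```

A full scan of this kind is the cost that an index for local search and a balanced partitioning across workers are meant to reduce.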