Due to the rapidly growing amount of available geospatial data and the need to present it in an understandable way, clustering this data is more important than ever. As clusters might contain a large number of objects, having a representative for each cluster significantly facilitates understanding a clustering. Clustering methods relying on such representatives are called center-based. In this work we consider the problem of center-based clustering of trajectories. In this setting, the representative of a cluster is again a trajectory. To obtain a compact representation of the clusters and to avoid overfitting, we restrict the complexity of the representative trajectories by a parameter l. This restriction, however, makes discrete distance measures like dynamic time warping (DTW) less suited. There is recent work on center-based clustering of trajectories with a continuous distance measure, namely, the Fréchet distance. While the Fréchet distance allows for restriction of the center complexity, it can also be sensitive to outliers, whereas averaging-type distance measures, like DTW, are less so. To obtain a trajectory clustering algorithm that allows restricting center complexity and is more robust to outliers, we propose using a continuous version of DTW as the distance measure, which we call continuous dynamic time warping (CDTW). Our contribution is twofold: (1) To combat the lack of practical algorithms for CDTW, we develop an approximation algorithm that computes it. (2) We develop the first clustering algorithm under this distance measure and show a practical way to compute a center from a set of trajectories and subsequently iteratively improve it. To obtain insights into the results of clustering under CDTW on practical data, we conduct extensive experiments.
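One way to see why discrete DTW fits poorly with low-complexity centers is a small sketch of the textbook discrete DTW recurrence. This is an illustration only, not the CDTW approximation algorithm of this work; the function name `dtw`, the use of Euclidean point distance, and the example trajectories are assumptions made for the example.

```python
import math


def dtw(p, q):
    """Textbook discrete DTW between two point sequences (assumed helper).

    p and q are lists of (x, y) tuples. The cost of a warping path is the
    sum of Euclidean distances between matched vertices.
    """
    n, m = len(p), len(q)
    # cost[i][j] = DTW of the first i points of p and the first j points of q
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(p[i - 1], q[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance on p only
                cost[i][j - 1],      # advance on q only
                cost[i - 1][j - 1],  # advance on both
            )
    return cost[n][m]


# A straight segment sampled at five vertices (hypothetical input trajectory).
dense = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
# The same segment described by only two vertices (a low-complexity center).
center = [(0, 0), (4, 0)]

print(dtw(dense, dense))   # identical sampling
print(dtw(dense, center))  # same geometry, different sampling
```

In this example both sequences trace the same segment, yet the discrete measure is nonzero for the two-vertex center because only vertices are matched, never points in the interior of an edge. A continuous measure is not affected by the sampling in this way.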