Sharing trajectories is beneficial for many real-world applications, such as managing disease spread through contact tracing and tailoring public services to a population's travel patterns. However, public concern over privacy and data protection has limited the extent to which this data is shared. Local differential privacy enables data sharing in which users share a perturbed version of their data, but existing mechanisms fail to incorporate user-independent public knowledge (e.g., business locations and opening times, public transport schedules, geo-located tweets). This limitation makes mechanisms too restrictive, gives unrealistic outputs, and ultimately leads to low practical utility. To address these concerns, we propose a local differentially private mechanism that is based on perturbing hierarchically-structured, overlapping $n$-grams (i.e., contiguous subsequences of length $n$) of trajectory data. Our mechanism uses a multi-dimensional hierarchy over publicly available external knowledge of real-world places of interest to improve the realism and utility of the perturbed, shared trajectories. Importantly, including real-world public data does not negatively affect privacy or efficiency. Our experiments, using real-world data and a range of queries, each with real-world application analogues, demonstrate the superiority of our approach over a range of alternative methods.
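To make the two core ingredients concrete, the following minimal sketch extracts overlapping $n$-grams from a trajectory and perturbs each one locally. It does not reproduce the mechanism proposed here: the hierarchy and the public external knowledge are omitted, and generalized (k-ary) randomized response is swapped in as a generic local differentially private primitive. The place names, the function names, and the even split of the privacy budget across $n$-grams are all assumptions made for illustration.

```python
import math
import random
from itertools import product


def overlapping_ngrams(trajectory, n):
    """Return every contiguous length-n subsequence, sliding by one position."""
    return [tuple(trajectory[i:i + n]) for i in range(len(trajectory) - n + 1)]


def k_randomized_response(value, domain, epsilon, rng):
    """Generalized randomized response over a finite domain of size k.

    The true value is kept with probability e^eps / (e^eps + k - 1);
    otherwise one of the other k - 1 values is reported uniformly at random.
    This is a stand-in primitive, not the mechanism described in the abstract.
    """
    k = len(domain)
    if k == 1:
        return value
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return value
    return rng.choice([v for v in domain if v != value])


def perturb_trajectory(trajectory, places, n, epsilon, rng):
    """Perturb each overlapping n-gram independently on the user's device.

    Assumption: the total budget epsilon is split evenly across the n-grams,
    so the overall guarantee follows from sequential composition.
    """
    grams = overlapping_ngrams(trajectory, n)
    domain = list(product(places, repeat=n))  # all candidate n-grams
    eps_each = epsilon / len(grams)
    return [k_randomized_response(g, domain, eps_each, rng) for g in grams]


# Hypothetical places of interest and a hypothetical user trajectory.
places = ["home", "cafe", "office", "gym"]
trajectory = ["home", "cafe", "office", "gym", "home"]

rng = random.Random(0)
print(overlapping_ngrams(trajectory, 2))
print(perturb_trajectory(trajectory, places, n=2, epsilon=4.0, rng=rng))
```

Because the randomized response domain here contains every combination of places, the output can include implausible $n$-grams (for example, a visit to a closed business), which is the kind of unrealistic output that public knowledge is meant to rule out.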