In location-based services (LBS), protecting user privacy can conflict with ensuring the security of both users and trips. While LBS providers can adopt privacy-preservation mechanisms to obfuscate customer data, the accuracy of vehicle location data and trajectories is crucial for detecting anomalies, especially when the LBS relies on machine learning methods. This paper tackles this dilemma by evaluating the tradeoff between location privacy and security in LBS. In particular, we investigate the impact of applying location privacy-preservation techniques on the performance of two anomaly detectors: one based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and one based on a Recurrent Neural Network (RNN). The experimental results suggest that, when privacy preservation is applied to location data, DBSCAN is more sensitive to Laplace noise than the RNN, although the two achieve similar detection accuracy on trip data without privacy preservation. Further experiments reveal that DBSCAN does not scale to large datasets containing millions of trips, because of the large number of computations needed to cluster the trips. On the other hand, DBSCAN requires less than 10 percent of the data used by the RNN to achieve similar performance on vehicle data without obfuscation, demonstrating that clustering-based methods can be readily applied to small datasets. Based on these results, we recommend usage scenarios for the two types of trajectory anomaly detectors under privacy preservation, taking into account customers' need for privacy, the size of the available vehicle trip data, and the real-time constraints of the LBS application.
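The interaction between Laplace obfuscation and density-based detection can be illustrated with a minimal sketch. It is not the pipeline evaluated in this paper: it clusters individual 2-D points rather than trips, and the names `laplace_obfuscate` and `dbscan`, the toy grid of locations, and the values of `eps`, `min_pts`, and `scale` are all assumptions chosen for illustration only.

```python
import math
import random


def laplace_obfuscate(points, scale, rng):
    """Add independent Laplace(0, scale) noise to each coordinate."""
    def noise():
        # The difference of two i.i.d. exponentials with mean `scale`
        # follows a Laplace(0, scale) distribution.
        return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return [(x + noise(), y + noise()) for x, y in points]


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(points)
    # All pairwise neighbourhoods: this quadratic step is what makes
    # naive DBSCAN expensive on large datasets.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # expand only from core points
    return labels


# Hypothetical toy data: a dense 5 x 4 grid of "normal" locations
# (spacing 0.1, arbitrary units) plus one far-away anomalous location.
points = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(4)]
points.append((50.0, 50.0))

rng = random.Random(0)
clean_labels = dbscan(points, eps=0.15, min_pts=3)
noisy_labels = dbscan(laplace_obfuscate(points, scale=0.5, rng=rng),
                      eps=0.15, min_pts=3)

# On the clean data, the only noise point is the injected anomaly.
print("flagged without obfuscation:", clean_labels.count(-1))
print("flagged with Laplace noise: ", noisy_labels.count(-1))
```

In this toy setting the noise scale is large relative to the spacing between normal points, so obfuscation tends to break up the dense region that DBSCAN relies on. This is one plausible intuition for the sensitivity reported above, not a reproduction of the experiments.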