This paper considers the private release of sample means of speed values from traffic datasets. Our key contribution is the development of user-level differentially private algorithms whose parameter values are carefully chosen to yield low estimation errors on real-world datasets while preserving privacy. We test our algorithms on ITMS (Intelligent Traffic Management System) data from an Indian city, where the speeds of different buses are drawn in a potentially non-i.i.d. manner from an unknown distribution, and where the number of speed samples contributed may differ from bus to bus. We then apply our algorithms to a synthetic dataset, generated from the ITMS data, that has either a large number of users or a large number of samples per user. For these settings, we provide recommendations for the choices of parameters and algorithm subroutines that result in low estimation errors while guaranteeing user-level privacy.
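To make the setting concrete, the following is a minimal sketch of one generic way to release a user-level differentially private sample mean: cap the number of samples kept per user, clip speeds to a known range, and add Laplace noise scaled to the resulting sensitivity. It is not taken from the paper and may differ from the algorithms proposed there. The function name `private_mean` and the parameters `upper` (assumed maximum speed) and `max_per_user` (assumed per-user sample cap) are hypothetical, and the sketch assumes the per-user sample counts are public.

```python
import random


def private_mean(samples_by_user, upper, epsilon, max_per_user, rng=None):
    """Illustrative Laplace-mechanism sketch of a user-level DP sample mean.

    Assumptions (not from the paper): speeds lie in [0, upper], the number
    of samples per user is public, and neighbouring datasets differ in the
    sample values of exactly one user.
    """
    if upper <= 0 or epsilon <= 0 or max_per_user <= 0:
        raise ValueError("upper, epsilon and max_per_user must be positive")
    rng = rng or random.Random()

    # Keep at most max_per_user samples per user and clip each to [0, upper].
    kept = [
        [min(max(float(x), 0.0), upper) for x in user[:max_per_user]]
        for user in samples_by_user
    ]
    counts = [len(k) for k in kept]
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples to average")

    mean = sum(sum(k) for k in kept) / total

    # Changing one user's samples moves the sum by at most upper * (that
    # user's count), so the mean moves by at most upper * max(counts) / total.
    sensitivity = upper * max(counts) / total

    # The difference of two i.i.d. exponentials with mean `scale` is
    # Laplace(0, scale) distributed.
    scale = sensitivity / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return mean + noise, sensitivity


# Three hypothetical buses with different numbers of speed samples (km/h).
speeds = [[10, 20, 30, 40], [50], [60, 70]]
estimate, sens = private_mean(speeds, upper=65, epsilon=1.0, max_per_user=2)
```

A smaller `max_per_user` lowers the sensitivity, and hence the noise, but discards more samples. This is the kind of trade-off that parameter choices in this setting have to balance. The sketch ignores floating-point subtleties of noise sampling and is not meant for production use.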