Publishing streaming data in a privacy-preserving manner has been a key research focus for many years. This issue presents considerable challenges, particularly due to the correlations prevalent within the data stream. Existing approaches either fall short in effectively leveraging these correlations, leading to a suboptimal utility-privacy tradeoff, or they involve complex mechanism designs that increase the computation complexity with respect to the sequence length. In this paper, we introduce Sequence Information Privacy (SIP), a new privacy notion designed to guarantee privacy for an entire data stream, taking into account the intrinsic data correlations. We show that SIP provides a similar level of privacy guarantee compared to local differential privacy (LDP), and it also enjoys a lightweight modular mechanism design. We further study two online data release models (instantaneous or batched) and propose corresponding privacy-preserving data perturbation mechanisms. We provide a numerical evaluation of how correlations influence noise addition in data streams. Lastly, we conduct experiments using real-world data to compare the utility-privacy tradeoff offered by our approaches with those from existing literature. The results reveal that our mechanisms offer utility improvements more than twice those based on LDP-based mechanisms.
翻译:暂无翻译