Analytical SQL queries are essential for extracting insights from relational databases, but they also introduce significant privacy risks because their results can expose sensitive information. To mitigate these risks, numerous query sanitization systems have been developed, and their diverse approaches create a complex landscape for both researchers and practitioners. These systems differ fundamentally in their design: the underlying privacy model, such as k-anonymity or Differential Privacy; the protected privacy unit, at either the tuple or the user level; and the software architecture, which can be proxy-based or integrated. This paper provides a systematic classification of state-of-the-art SQL sanitization systems based on these qualitative criteria and on the scope of queries they support. Furthermore, we present a quantitative analysis of leading systems, empirically measuring the trade-offs between data utility, query execution overhead, and privacy guarantees across a range of analytical queries. This work offers a structured overview and performance assessment intended to clarify the capabilities and limitations of current privacy-preserving database technologies.