Current practices for reporting the level of differential privacy (DP) protection for machine learning (ML) algorithms such as DP-SGD provide an incomplete and potentially misleading picture of the privacy guarantees. For instance, if only a single $(\varepsilon,\delta)$ pair is known for a mechanism, standard analyses cannot rule out highly accurate inference attacks against training data records, even when such accurate attacks do not in fact exist. In this position paper, we argue that using non-asymptotic Gaussian Differential Privacy (GDP) as the primary means of communicating DP guarantees in ML avoids these potential downsides. Using two recent developments in the DP literature, namely (i) open-source numerical accountants capable of computing the privacy profile and $f$-DP curves of DP-SGD to arbitrary accuracy, and (ii) a decision-theoretic metric over DP representations, we show how to obtain non-asymptotic GDP bounds from numerical accountants, and demonstrate that GDP captures the entire privacy profile of DP-SGD and related algorithms with virtually no error, as quantified by the metric. To support our claims, we investigate the privacy profiles of state-of-the-art DP large-scale image classification models and of the TopDown algorithm for the U.S. Decennial Census, observing that GDP fits their profiles remarkably well in all cases. We conclude with a discussion of the strengths and weaknesses of this approach, and of which other privacy mechanisms could benefit from GDP.
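For concreteness, the claims above can be read through the standard hypothesis-testing formulation of DP; the definitions below are standard background from the $f$-DP literature rather than results of this paper. A single $(\varepsilon,\delta)$ guarantee only constrains an attacker's trade-off between type I error $\alpha$ and type II error $\beta$ to lie above the piecewise-linear curve
$$
f_{\varepsilon,\delta}(\alpha) \;=\; \max\bigl\{0,\; 1-\delta-e^{\varepsilon}\alpha,\; e^{-\varepsilon}(1-\delta-\alpha)\bigr\},
$$
whereas $\mu$-GDP specifies the full trade-off curve of testing between two unit-variance Gaussians whose means are $\mu$ apart,
$$
G_{\mu}(\alpha) \;=\; \Phi\bigl(\Phi^{-1}(1-\alpha)-\mu\bigr),
$$
where $\Phi$ is the standard normal CDF. The gap between the pointwise bound $f_{\varepsilon,\delta}$ and a mechanism's actual trade-off curve is what can make a lone $(\varepsilon,\delta)$ report overly pessimistic about achievable attack accuracy.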