Firms and statistical agencies must protect the privacy of the individuals whose data they collect, analyze, and publish. Increasingly, these organizations do so by using publication mechanisms that satisfy differential privacy. We consider the problem of choosing such a mechanism so as to maximize the value of its output to end users. We show that mechanisms that add noise to the statistic of interest (like most of those used in practice) are generally not optimal when the statistic is a sum or average of magnitude data (e.g., income). However, we also show that adding noise is always optimal when the statistic is a count of data entries with a certain characteristic and the underlying database is drawn from a symmetric distribution (e.g., if individuals' data are i.i.d.). When, in addition, data users have supermodular payoffs, we show that the simple geometric mechanism is always optimal. The proof of this result uses a novel comparative static that ranks information structures according to their usefulness in supermodular decision problems.
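To make the count-query case concrete, the following is a minimal Python sketch of the standard two-sided geometric mechanism from the differential privacy literature. It adds integer noise Z with P(Z = z) = (1 - α)/(1 + α) · α^|z|, where α = exp(-ε), to a true count. The function names (`geometric_pmf`, `sample_geometric_noise`, `geometric_mechanism`) and the unbounded output range are assumptions made for illustration; the variant analyzed in the paper may differ, for example by truncating the output to the feasible range of counts.

```python
import math
import random


def geometric_pmf(z: int, epsilon: float) -> float:
    """Probability that two-sided geometric noise equals the integer z."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    alpha = math.exp(-epsilon)
    return (1 - alpha) / (1 + alpha) * alpha ** abs(z)


def sample_geometric_noise(epsilon: float, rng: random.Random) -> int:
    """Draw two-sided geometric noise as the difference of two
    i.i.d. one-sided geometric variables on {0, 1, 2, ...}."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    alpha = math.exp(-epsilon)

    def one_sided() -> int:
        # Inverse-transform sampling: P(G >= k) = alpha ** k.
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        return int(math.floor(math.log(u) / math.log(alpha)))

    return one_sided() - one_sided()


def geometric_mechanism(true_count: int, epsilon: float,
                        rng: random.Random) -> int:
    """Release a count with additive two-sided geometric noise.

    Hypothetical helper: assumes the count changes by at most 1 when
    one individual's record changes (sensitivity 1).
    """
    return true_count + sample_geometric_noise(epsilon, rng)
```

A call such as `geometric_mechanism(42, 0.5, random.Random())` returns a noisy integer count. Because the probabilities of adjacent noise values differ by a factor of exactly exp(ε), two databases whose true counts differ by one induce output distributions whose probabilities differ by at most that factor, which is the ε-differential-privacy guarantee for a sensitivity-1 count.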