A user generates n independent and identically distributed (i.i.d.) data random variables with a probability mass function that must be guarded from a querier. The querier must recover, with a prescribed accuracy, a given function of the data from each of n i.i.d. query responses elicited from the user. The user chooses the data probability mass function and devises the random query responses to maximize distribution privacy, as gauged by the (Kullback-Leibler) divergence between that probability mass function and the querier's best estimate of it based on the n query responses. For an arbitrary function, a basic achievable lower bound for distribution privacy is provided that does not depend on n and corresponds to worst-case privacy. Worst-case privacy equals the log-sum of the cardinalities of inverse atoms under the given function, with the number of summands decreasing as the querier recovers the function with improving accuracy. Next, upper (converse) and lower (achievability) bounds for distribution privacy, dependent on n, are developed. The upper bound improves upon worst-case privacy, and the lower bound does so under suitable assumptions; both converge to worst-case privacy as n grows. The converse and achievability proofs identify explicit strategies for the user and the querier.
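The two quantities named in the abstract, the Kullback-Leibler divergence and a log-sum of inverse-atom cardinalities, can be illustrated with a minimal sketch. The helper names (`inverse_atoms`, `log_sum_cardinalities`, `kl_divergence`) are hypothetical. The abstract does not say which inverse atoms enter the sum, so the `selected` argument is an assumption left to the caller, and "log-sum" is read here as the natural logarithm of the summed cardinalities.

```python
import math
from collections import defaultdict


def inverse_atoms(domain, f):
    """Group domain elements by their image under f.

    Returns a dict mapping each value y to the list f^{-1}(y).
    """
    atoms = defaultdict(list)
    for x in domain:
        atoms[f(x)].append(x)
    return dict(atoms)


def log_sum_cardinalities(atoms, selected):
    """Natural log of the summed sizes of the selected inverse atoms.

    Assumption: which atoms to select is not specified in the abstract,
    so the caller supplies them.
    """
    return math.log(sum(len(atoms[y]) for y in selected))


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats.

    p and q are pmfs given as equal-length sequences over the same alphabet.
    Terms with p_i = 0 contribute nothing; q_i = 0 with p_i > 0 gives infinity.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


if __name__ == "__main__":
    # Toy example: data alphabet {0, ..., 5}, function f(x) = x mod 3.
    atoms = inverse_atoms(range(6), lambda x: x % 3)
    print(atoms)
    print(log_sum_cardinalities(atoms, [0, 1]))
    print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

In this toy setting every inverse atom has two elements, so selecting fewer atoms shrinks the sum, which mirrors the abstract's remark that the number of summands decreases as recovery accuracy improves.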