Optimal Data Acquisition with Privacy-Aware Agents

We study the problem faced by a data analyst or platform that wishes to collect private data from privacy-aware agents. To incentivize participation, the platform provides a service to the agents in exchange for their data, in the form of a statistic computed using all agents' submitted data. Each agent decides whether to join the platform (and truthfully reveal their data) or not to participate, weighing the privacy cost of joining against the benefit of obtaining the statistic. The platform must ensure the statistic is computed differentially privately and chooses a central level of noise to add to the computation, but it can also induce personalized privacy levels (or costs) by giving different weights to different agents in the computation as a function of their heterogeneous privacy preferences, which are known to the platform. We assume the platform aims to optimize the accuracy of the statistic and must pick the privacy level of each agent to trade off between (i) incentivizing more participation and (ii) adding less noise to the estimate. We provide a semi-closed-form characterization of the platform's optimal choice of agent weights in two variants of our model. In both variants, we identify a common nontrivial structure in the platform's optimal solution: an instance-specific number of agents with the least stringent privacy requirements are pooled together and given the same weight, while the weights of the remaining agents decrease as a function of the strength of their privacy requirement. We also provide algorithmic results on how to find the optimal value of the noise parameter used by the platform and of the weights given to the agents.
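To make the link between weights and personalized privacy levels concrete, here is a minimal Python sketch. It is an illustration under our own assumptions, not the paper's model or algorithm: the statistic is taken to be a weighted mean of values in [0, 1], the weights are non-negative and sum to 1, and the central noise is Laplace with scale `noise_scale`. Under those assumptions, changing agent i's value moves the weighted mean by at most w_i, so the standard Laplace-mechanism argument gives agent i a privacy level of w_i / noise_scale. The function names `private_weighted_mean` and `per_agent_epsilon` are hypothetical.

```python
import random


def private_weighted_mean(data, weights, noise_scale, rng=None):
    """Weighted mean of values in [0, 1] plus Laplace noise (illustrative only).

    Assumptions (not taken from the paper): data lie in [0, 1], weights are
    non-negative and sum to 1, and the noise is Laplace(0, noise_scale).
    """
    if rng is None:
        rng = random.Random()
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length")
    if any(not 0.0 <= x <= 1.0 for x in data):
        raise ValueError("data values must lie in [0, 1]")
    if any(w < 0.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    if noise_scale <= 0.0:
        raise ValueError("noise_scale must be positive")

    estimate = sum(w * x for w, x in zip(weights, data))
    # The difference of two independent Exp(1) draws is a standard Laplace draw.
    noise = noise_scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return estimate + noise


def per_agent_epsilon(weights, noise_scale):
    """Privacy level induced for each agent: sensitivity w_i over the noise scale."""
    return [w / noise_scale for w in weights]


if __name__ == "__main__":
    data = [1.0, 0.0, 0.0]
    # Hypothetical weights: a larger weight means a weaker privacy guarantee.
    weights = [0.5, 0.25, 0.25]
    print(per_agent_epsilon(weights, noise_scale=0.5))
    print(private_weighted_mean(data, weights, noise_scale=0.5, rng=random.Random(0)))
```

The sketch shows the two levers the abstract describes: a smaller `noise_scale` improves accuracy but raises every agent's privacy cost, while shifting weight between agents redistributes that cost at a fixed noise level. How the weights should be chosen, and which agents end up pooled at the same weight, is what the paper characterizes; the code does not attempt that.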
