This paper investigates the problem of collecting multidimensional data over time (i.e., longitudinal studies) for the fundamental task of frequency estimation under Local Differential Privacy (LDP) guarantees. Unlike frequency estimation of a single attribute, the multidimensional setting demands particular attention to how the privacy budget is spent. Moreover, when user statistics are collected longitudinally, privacy progressively degrades. The combination of these "multiple" settings (i.e., many attributes and several collections over time) imposes several challenges, for which this paper proposes the first solution for frequency estimation under LDP. To tackle these issues, we extend the analysis of three state-of-the-art LDP protocols, namely Generalized Randomized Response (GRR), Optimized Unary Encoding (OUE), and Symmetric Unary Encoding (SUE), to both longitudinal and multidimensional data collections. While the existing literature applies the same protocol, OUE or SUE, in both rounds of sanitization (a.k.a. memoization), yielding L-OUE and L-SUE, respectively, we show analytically and experimentally that applying OUE in the first round and SUE in the second (i.e., L-OSUE) provides higher data utility. In addition, for attributes with small domain sizes, we propose Longitudinal GRR (L-GRR), which provides higher utility than the protocols based on unary encoding. Finally, we propose a new solution named Adaptive LDP for LOngitudinal and Multidimensional FREquency Estimates (ALLOMFREE), which randomly samples a single attribute to be reported with the whole privacy budget and adaptively selects the optimal protocol, i.e., either L-GRR or L-OSUE. In our experiments, ALLOMFREE consistently and considerably outperforms the state-of-the-art L-SUE and L-OUE protocols in the quality of the frequency estimates.
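As background for the protocols named above, the following is a minimal sketch of standard one-shot GRR frequency estimation for a single attribute. It is not the paper's L-GRR, L-OSUE, or ALLOMFREE: it omits memoization, the second sanitization round, and attribute sampling. The function names `grr_perturb` and `grr_estimate` and the toy data are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain, epsilon, rng):
    """Client side: keep the true value with probability p,
    otherwise report a different value chosen uniformly at random."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def grr_estimate(reports, domain, epsilon):
    """Server side: unbiased frequency estimate (C(v)/n - q) / (p - q)."""
    k = len(domain)
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = 1.0 / (math.exp(epsilon) + k - 1)
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}


if __name__ == "__main__":
    # Hypothetical toy data: one attribute with a domain of size 4.
    rng = random.Random(42)
    domain = ["a", "b", "c", "d"]
    data = ["a"] * 10000 + ["b"] * 6000 + ["c"] * 3000 + ["d"] * 1000
    reports = [grr_perturb(v, domain, 2.0, rng) for v in data]
    for value, freq in grr_estimate(reports, domain, 2.0).items():
        print(value, round(freq, 3))
```

In this sketch the gap between p and q shrinks as the domain size k grows, so the estimator's variance increases with k. This is consistent with the abstract's statement that the GRR-based protocol is preferred for attributes with small domain sizes.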