With advances in connected devices, a huge amount of real-time data is being generated. Efficient storage, transmission, and analysis of this real-time big data is important, as it serves purposes ranging from decision making to fault prediction. Real-time big data also has rigorous utility and privacy requirements, so handling strategies must be chosen carefully. One of the optimal ways to store and transmit data under lossless compression is Huffman coding, which compresses the data into a variable-length binary stream. Similarly, differential privacy is now widely used to protect the privacy of such big data; it perturbs the data on the basis of a privacy budget and sensitivity. Traditional differential privacy mechanisms do provide privacy guarantees. However, real-time data cannot be treated as an ordinary set of records, because it usually has underlying patterns and cycles that can be linked to a specific individual's private information, leading to severe privacy leakage (e.g., analysing smart metering data can reveal an individual's daily routine). It is therefore equally important to develop a privacy preservation model that preserves privacy on the basis of occurrences and patterns in the data. In this paper, we design a novel Huff-DP mechanism, which selects the optimal privacy budget on the basis of the privacy requirement of each specific record. To further enhance budget determination, we propose static, sine, and fuzzy logic based decision algorithms. Experimental evaluations show that the proposed Huff-DP mechanism provides effective privacy protection while also reducing the computational cost associated with the privacy budget.
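As a rough illustration of how the two ingredients named above could be combined, the following sketch builds Huffman code lengths from value occurrences and uses them to pick a per-record privacy budget for the Laplace mechanism. It is not the Huff-DP algorithm itself: the linear mapping in `budget_for_length`, the bounds `eps_min` and `eps_max`, the default sensitivity, and all function names are assumptions made here for illustration, on the premise that rarely occurring values (longer codes) call for a smaller budget and therefore more noise.

```python
import heapq
import random
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: Huffman code length in bits} for the given sequence."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def budget_for_length(length, min_len, max_len, eps_min=0.1, eps_max=1.0):
    """Hypothetical static rule: longer code (rarer value) -> smaller epsilon."""
    if max_len == min_len:
        return eps_max
    frac = (length - min_len) / (max_len - min_len)
    return eps_max - frac * (eps_max - eps_min)


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(readings, sensitivity=1.0, seed=0):
    """Add Laplace noise to each reading, with a budget chosen per record."""
    rng = random.Random(seed)
    lengths = huffman_code_lengths(readings)
    lo, hi = min(lengths.values()), max(lengths.values())
    noisy = []
    for x in readings:
        eps = budget_for_length(lengths[x], lo, hi)
        noisy.append(x + laplace_noise(sensitivity / eps, rng))
    return noisy


if __name__ == "__main__":
    print(perturb([3, 3, 3, 3, 5, 5, 9]))
```

The sine and fuzzy logic based decision algorithms mentioned in the abstract would replace `budget_for_length` with a different mapping; their exact form is not given here, so they are not sketched.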