In this paper, we propose a practically efficient model for securely computing rank-based statistics, e.g., median, percentiles and quartiles, over distributed datasets in the malicious setting without leaking individual data privacy. Based on the binary search technique of Aggarwal et al. (EUROCRYPT \textquotesingle 04), we respectively present an interactive protocol and a non-interactive protocol, involving at most $\log ||R||$ rounds, where $||R||$ is the range size of the dataset elements. Besides, we introduce a series of optimisation techniques to reduce the round complexity. Our computing model is modular and can be instantiated with either homomorphic encryption or secret-sharing schemes. Compared to the state-of-the-art solutions, it provides stronger security and privacy while maintaining high efficiency and accuracy. Unlike differential-privacy-based solutions, it does not suffer a trade-off between accuracy and privacy. On the other hand, it only involves $O(N \log ||R||)$ time complexity, which is far more efficient than those bitwise-comparison-based solutions with $O(N^2\log ||R||)$ time complexity, where $N$ is the dataset size. Finally, we provide a UC-secure instantiation with the threshold Paillier cryptosystem and $\Sigma$-protocol zero-knowledge proofs of knowledge.
翻译:在本文中,我们提出了一个在不泄露个人数据隐私的情况下对恶意环境中分布的数据集进行安全计算等级统计(例如中位数、百分位数和四分位数)的实用高效模型。根据Aggarwal等人的二进制搜索技术(EROCRYPT\ text quoteingle 04),我们分别提出了一个互动协议和非互动协议,最多涉及$\log ⁇ R $,其中美元是数据集的范围大小。此外,我们采用了一系列优化技术来降低圆形复杂性。我们的计算模型是模块化的,可以与同变加密或秘密共享计划同时进行即时化。与最先进的解决方案相比,它提供了更强的安全和隐私,同时保持了更高的效率和准确性。与基于差异的解决方案不同,它不会在精确性和隐私之间造成交易。另一方面,它只涉及$O(N) logR $(美元) 和基于时间精度(美元) (美元) 的精确性解决方案,它比这些精确性标准复杂度高得多。