Outsourcing anomaly detection to third parties allows data owners to overcome resource constraints (e.g., in lightweight IoT devices), facilitate collaborative analysis (e.g., in distributed or multi-party scenarios), and benefit from lower costs and specialized expertise (e.g., of Managed Security Service Providers). Despite these benefits, a data owner may be reluctant to outsource anomaly detection without sufficient privacy protection. Here, most existing privacy solutions face a novel challenge: preserving privacy usually requires the differences between data entries to be eliminated or reduced, whereas anomaly detection critically depends on those differences. This conflict was recently resolved in a local analysis setting with trusted analysts (where no outsourcing is involved) by shifting the focus of the differential privacy (DP) guarantee from "all" entries to only "benign" entries. In this paper, we observe that such an approach is not directly applicable to the outsourcing setting, because data owners do not know which entries are "benign" prior to outsourcing and hence cannot selectively apply DP to data entries. We therefore propose a novel iterative solution in which the data owner gradually "disentangles" the anomalous entries from the benign ones, so that the third-party analyst can produce accurate anomaly detection results with a sufficient DP guarantee. We design and implement our Differentially Private Outsourcing of Anomaly Detection (DPOAD) framework, and demonstrate its benefits over the baseline Laplace and PainFree mechanisms through experiments with real data from different application domains.