DPOAD: Differentially Private Outsourcing of Anomaly Detection through Iterative Sensitivity Learning

Outsourcing anomaly detection to third parties can allow data owners to overcome resource constraints (e.g., in lightweight IoT devices), facilitate collaborative analysis (e.g., in distributed or multi-party scenarios), and benefit from lower costs and specialized expertise (e.g., that of Managed Security Service Providers). Despite such benefits, a data owner may be reluctant to outsource anomaly detection without sufficient privacy protection. Here, most existing privacy solutions face a novel challenge: preserving privacy usually requires the difference between data entries to be eliminated or reduced, whereas anomaly detection critically depends on that difference. This conflict was recently resolved in a local analysis setting with trusted analysts (where no outsourcing is involved) by moving the focus of the differential privacy (DP) guarantee from "all" entries to only the "benign" ones. In this paper, we observe that such an approach is not directly applicable to the outsourcing setting, because data owners do not know which entries are "benign" prior to outsourcing, and hence cannot selectively apply DP to data entries. Therefore, we propose a novel iterative solution for the data owner to gradually "disentangle" the anomalous entries from the benign ones, such that the third-party analyst can produce accurate anomaly results with a sufficient DP guarantee. We design and implement our Differentially Private Outsourcing of Anomaly Detection (DPOAD) framework, and demonstrate its benefits over baseline Laplace and PainFree mechanisms through experiments with real data from different application domains.
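The conflict described in the abstract can be illustrated with a plain Laplace baseline. The sketch below is not the DPOAD algorithm, whose details are not given in this abstract; it only perturbs each entry independently with Laplace noise. The function names, the clipping range, and the toy data are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_entries(values, lower, upper, epsilon, rng):
    """Clip each entry to [lower, upper] and add Laplace noise to it.

    Assumption: every entry is released on its own, so the sensitivity is
    the full width of the clipping range.
    """
    sensitivity = upper - lower
    scale = sensitivity / epsilon
    return [min(max(v, lower), upper) + laplace_noise(scale, rng) for v in values]


def top_index(values):
    # Index of the largest value, used here as a naive "most anomalous" pick.
    return max(range(len(values)), key=lambda i: values[i])


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: six benign readings and one anomaly at index 6.
    data = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 95.0]
    for eps in (0.5, 5.0, 500.0):
        noisy = perturb_entries(data, 0.0, 100.0, eps, rng)
        print(eps, top_index(noisy), [round(x, 1) for x in noisy])
```

With a small `epsilon` the noise scale is comparable to or larger than the gap between the anomaly and the benign entries, so the naive pick is unreliable; a very large `epsilon` preserves the gap but offers little privacy. This is the tension that motivates protecting only the benign entries.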


Related content

Anomaly detection

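In data mining, anomaly detection is the identification of items, events, or observations that do not conform to an expected pattern or to the other items in a dataset. Anomalous items typically translate into problems such as bank fraud, structural defects, medical problems, or errors in text. Anomalies are also called outliers, novelties, noise, deviations, and exceptions. In particular, when detecting abuse and network intrusion, the interesting objects are often not rare objects but unexpected bursts of activity. Such a pattern does not fit the usual statistical definition of an outlier as a rare object, so many anomaly detection methods (especially unsupervised ones) fail on this kind of data unless it has been aggregated appropriately. A cluster analysis algorithm, by contrast, may be able to detect the micro-clusters formed by these patterns. There are three broad categories of anomaly detection methods. [1] Unsupervised anomaly detection methods assume that most instances in the dataset are normal and detect anomalies in unlabeled test data by looking for the instances that fit the rest of the data least well. Supervised anomaly detection methods require a dataset labeled "normal" and "abnormal" and involve training a classifier (the key difference from many other statistical classification problems is the inherently unbalanced nature of anomaly detection). Semi-supervised anomaly detection methods build a model of normal behavior from a given normal training dataset and then test the likelihood that a test instance was generated by the learned model.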
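As a minimal sketch of the unsupervised category, the snippet below flags the values that fit the rest of the data least well using a modified z-score based on the median and the median absolute deviation (MAD). The choice of this statistic, the function name `flag_outliers`, and the threshold of 3.5 are assumptions made for illustration; the text above does not prescribe a specific method.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices whose modified z-score exceeds `threshold`."""
    center = median(values)
    # Median absolute deviation: a spread estimate that a few extreme
    # values cannot inflate, unlike the standard deviation.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # No usable spread estimate; this sketch flags nothing in that case.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data are normally distributed.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - center) / mad) > threshold
    ]


if __name__ == "__main__":
    readings = [10, 11, 9, 10, 12, 10, 95]  # hypothetical readings
    print(flag_outliers(readings))
```

This relies on the assumption stated above that most instances are normal: the median and MAD describe the majority, and anything far from them is reported.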
