Anonymization of event logs facilitates process mining while protecting sensitive information of process stakeholders. Existing techniques, however, focus on privatizing the control-flow. Other process perspectives, such as roles, resources, and objects, are neglected or subjected to randomization, which breaks the dependencies between the perspectives. Hence, existing techniques are not suited for advanced process mining tasks, e.g., social network mining or predictive monitoring. To address this gap, we propose PMDG, a framework that ensures privacy for multi-perspective process mining through data generalization. It provides group-based privacy guarantees for an event log while preserving the characteristic dependencies between the control-flow and further process perspectives. Unlike existing privatization techniques that rely on data suppression or noise insertion, PMDG adopts data generalization: a technique in which the activities and attribute values referenced in events are generalized into more abstract ones, to obtain equivalence classes that are sufficiently large from a privacy point of view. We demonstrate empirically that PMDG outperforms state-of-the-art anonymization techniques when mining handovers and predicting outcomes.
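The generalization idea described in the abstract can be illustrated with a minimal sketch. This is not PMDG's actual algorithm: it is a plain full-domain generalization loop over (activity, role) pairs, and the hierarchies `ACTIVITY_PARENT` and `ROLE_PARENT`, the function names, and the threshold `k` are all hypothetical names introduced for illustration only. It shows how lifting values to more abstract ones grows the equivalence classes while each event keeps its activity-role pairing.

```python
from collections import Counter

# Hypothetical generalization hierarchies (assumptions, not taken from the paper):
# each value maps to its more abstract parent.
ACTIVITY_PARENT = {
    "blood test": "lab work",
    "urine test": "lab work",
    "x-ray": "imaging",
    "mri": "imaging",
    "lab work": "diagnostics",
    "imaging": "diagnostics",
}
ROLE_PARENT = {
    "nurse": "clinical staff",
    "physician": "clinical staff",
    "radiologist": "clinical staff",
}


def generalize(value, parent):
    """Move a value one level up its hierarchy; root values stay unchanged."""
    return parent.get(value, value)


def generalize_until_k(events, k):
    """Lift all (activity, role) pairs one level at a time until every
    equivalence class (identical pair) has at least k members, or until
    no further generalization is possible."""
    current = list(events)
    if not current:
        return current
    while min(Counter(current).values()) < k:
        lifted = [
            (generalize(a, ACTIVITY_PARENT), generalize(r, ROLE_PARENT))
            for a, r in current
        ]
        if lifted == current:  # hierarchies exhausted; k cannot be reached
            break
        current = lifted
    return current


events = [
    ("blood test", "nurse"),
    ("urine test", "nurse"),
    ("x-ray", "radiologist"),
    ("mri", "physician"),
]
print(generalize_until_k(events, k=2))
# [('lab work', 'clinical staff'), ('lab work', 'clinical staff'),
#  ('imaging', 'clinical staff'), ('imaging', 'clinical staff')]
```

Because each event is generalized as a whole pair rather than having its attributes randomized independently, the link between what was done and who did it survives at the coarser level, which is the kind of cross-perspective dependency the abstract refers to.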