The applicability of process mining techniques hinges on the availability of event logs capturing the execution of a business process. In some use cases, particularly those involving customer-facing processes, these event logs may contain private information. Data protection regulations restrict the use of such event logs for analysis purposes. One way of circumventing these restrictions is to anonymize the event log to the extent that no individual can be singled out using the anonymized log. This paper addresses the problem of anonymizing an event log so as to guarantee that, upon disclosure of the anonymized log, the probability that an attacker can single out any individual represented in the original log does not increase by more than a given threshold. The paper proposes a differentially private disclosure mechanism, which oversamples the cases in the log and adds noise to the timestamps to the extent required to achieve this privacy guarantee. The paper reports on an empirical evaluation of the proposed approach, in terms of data utility loss and computational efficiency, using 14 real-life event logs.
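To make the idea of adding noise to timestamps concrete, the following is a minimal sketch of the generic Laplace mechanism applied to event timestamps. It is not the paper's mechanism: the function names, the `sensitivity` and `epsilon` parameters, and the choice to perturb every timestamp independently are assumptions made for illustration. The case-oversampling step and the accounting of the privacy budget across events are not shown.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_timestamps(timestamps, epsilon, sensitivity, rng=None):
    """Return a copy of `timestamps` (seconds) with Laplace noise added.

    Hypothetical helper: `sensitivity` is an assumed bound on how much
    one individual can shift a timestamp, and `epsilon` is the privacy
    parameter spent on each value. Noise scale = sensitivity / epsilon.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    return [t + laplace_noise(scale, rng) for t in timestamps]


if __name__ == "__main__":
    # Three events of one case, as offsets in seconds from the case start.
    events = [0.0, 60.0, 120.0]
    print(noisy_timestamps(events, epsilon=1.0, sensitivity=30.0))
```

A smaller `epsilon` yields a larger noise scale, so timestamps are distorted more. This is the trade-off between privacy and data utility loss that the evaluation measures.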