The applicability of process mining techniques hinges on the availability of event logs that capture the execution of a business process. In some use cases, particularly those involving customer-facing processes, these event logs may contain private information. Data protection regulations restrict the use of such event logs for analysis purposes. One way of overcoming these restrictions is to anonymize the event log to the extent that no individual can be singled out using the anonymized log. This article addresses the problem of anonymizing an event log so as to guarantee that, upon release of the anonymized log, the probability that an attacker may single out any individual represented in the original log does not increase by more than a given threshold. The article proposes a differentially private release mechanism, which samples the cases in the log and adds noise to the timestamps to the extent required to achieve this privacy guarantee. The article reports on an empirical comparison, in terms of data utility loss and computational efficiency, of the proposed approach against state-of-the-art approaches using 14 real-life event logs.
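The two ingredients named above, sampling cases and adding noise to timestamps, can be illustrated with a minimal Python sketch. It is not the mechanism proposed in the article: the function names, the Bernoulli sampling rate, the use of Laplace noise, and the `sensitivity` and `epsilon` parameters are all assumptions chosen for illustration. The sketch does not calibrate these parameters to the single-out probability threshold, which is what the article's mechanism is designed to do.

```python
import random
from typing import Dict, List, Optional, Tuple

# Hypothetical log representation: case id -> ordered list of
# (activity label, timestamp in seconds).
Event = Tuple[str, float]
EventLog = Dict[str, List[Event]]


def laplace_noise(rng: random.Random, scale: float) -> float:
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def anonymize_log(
    log: EventLog,
    sampling_rate: float,
    epsilon: float,
    sensitivity: float,
    seed: Optional[int] = None,
) -> EventLog:
    """Sample cases and perturb timestamps (illustrative only).

    sampling_rate: probability of keeping each case (assumed parameter).
    epsilon, sensitivity: assumed Laplace parameters; noise scale is
    sensitivity / epsilon, as in the standard Laplace mechanism.
    """
    if not 0.0 < sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be in (0, 1]")
    if epsilon <= 0.0 or sensitivity <= 0.0:
        raise ValueError("epsilon and sensitivity must be positive")

    rng = random.Random(seed)
    scale = sensitivity / epsilon
    released: EventLog = {}
    for case_id, events in log.items():
        # Bernoulli sampling: drop the whole case with prob. 1 - sampling_rate.
        if rng.random() >= sampling_rate:
            continue
        # Independent noise per event; activity order is kept as-is, so the
        # noisy timestamps are not guaranteed to remain monotone.
        released[case_id] = [
            (activity, ts + laplace_noise(rng, scale)) for activity, ts in events
        ]
    return released


if __name__ == "__main__":
    toy_log = {
        "case-1": [("register", 0.0), ("approve", 3600.0)],
        "case-2": [("register", 120.0), ("reject", 900.0)],
    }
    print(anonymize_log(toy_log, sampling_rate=0.8, epsilon=1.0,
                        sensitivity=60.0, seed=7))
```

A smaller `epsilon` or a lower `sampling_rate` gives stronger protection at the cost of utility, which is the trade-off the empirical comparison measures. A real release would also need to replace the case identifiers and to account for the total privacy budget spent across all events of a case, neither of which this sketch attempts.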