Process mining is a relatively new discipline that bridges traditional process modelling and data mining. Process discovery, one of its most critical tasks, aims to discover process models automatically from event logs. The performance of existing process discovery algorithms can degrade when activity labels are missing from event logs. Several methods have been proposed to repair missing activity labels, but their accuracy can drop when a large number of labels are missing. In this paper, we propose an LSTM-based prediction model to predict the missing activity labels in event logs. The proposed model takes both the prefix and suffix sequences of the events with missing activity labels as input. Additional attributes of event logs are also utilised to improve performance. Our evaluation on several publicly available datasets shows that the proposed method performs consistently better than existing methods for repairing missing activity labels in event logs.
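To make the input construction concrete, the following minimal sketch shows one way the prefix and suffix sequences around an event with a missing activity label could be extracted from a trace. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function name `prefix_suffix_windows`, the fixed `window` size, and the `<pad>` and `<unk>` tokens are all hypothetical choices, and the additional event attributes are omitted.

```python
from typing import List, Optional, Tuple

# Hypothetical special tokens, chosen for this sketch only.
PAD = "<pad>"   # fills positions beyond the start or end of the trace
UNK = "<unk>"   # stands in for neighbouring events whose label is also missing

Trace = List[Optional[str]]  # None marks a missing activity label


def prefix_suffix_windows(
    trace: Trace, window: int = 3
) -> List[Tuple[int, List[str], List[str]]]:
    """For each missing label, return (position, prefix, suffix).

    The prefix holds up to `window` labels before the event and the suffix
    up to `window` labels after it, both padded to a fixed length so they
    could be fed to a sequence model such as an LSTM.
    """
    samples = []
    for i, label in enumerate(trace):
        if label is not None:
            continue  # only events with a missing label need a prediction
        prefix = [x if x is not None else UNK for x in trace[max(0, i - window):i]]
        suffix = [x if x is not None else UNK for x in trace[i + 1:i + 1 + window]]
        # Left-pad the prefix and right-pad the suffix to exactly `window` items.
        prefix = [PAD] * (window - len(prefix)) + prefix
        suffix = suffix + [PAD] * (window - len(suffix))
        samples.append((i, prefix, suffix))
    return samples


if __name__ == "__main__":
    example = ["register", "check", None, "approve", "pay"]
    for position, prefix, suffix in prefix_suffix_windows(example, window=3):
        print(position, prefix, suffix)
```

Each resulting pair could then be encoded as integer indices and passed to two sequence encoders, one per direction, whose outputs are combined to predict the missing label. That downstream architecture is likewise an assumption here.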