Various machine learning and deep learning methods have been proposed for the different tasks in predictive process monitoring, which forecast properties of an ongoing case such as the most likely next event or suffix, its remaining time, or an outcome-related variable. Recurrent neural networks (RNNs), and more specifically long short-term memory networks (LSTMs), stand out in terms of popularity. In this work, we investigate whether such an LSTM can actually learn the underlying process model structure of an event log. We introduce an evaluation framework that combines variant-based resampling with custom metrics for fitness, precision, and generalization. We evaluate four hypotheses concerning the learning capabilities of LSTMs, the effect of overfitting countermeasures, the level of incompleteness in the training set, and the level of parallelism in the underlying process model. We confirm that LSTMs can struggle to learn process model structure, even with simplistic process data and in a very lenient setup. Taking the correct anti-overfitting measures can alleviate the problem. However, these measures did not turn out to be optimal when hyperparameters were selected purely on prediction accuracy. We also found that decreasing the amount of information seen by the LSTM during training causes a sharp drop in generalization and precision scores. In our experiments, we could not identify a relationship between the extent of parallelism in the model and the generalization capability, but the results do indicate that the complexity of the process might have an impact.
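To make the idea of variant-based resampling with fitness, precision, and generalization scores more concrete, the following is a minimal sketch and not the framework used in this work. It assumes that an event log is a list of traces (lists of activity labels), that whole variants are held out of the training set, and that the three scores are simple set overlaps. The function names `split_by_variant` and `variant_scores`, the toy log, and the metric definitions are all illustrative assumptions.

```python
import random


def split_by_variant(log, test_fraction=0.2, seed=0):
    """Hold out whole variants so test variants never appear in training.

    Assumption: a variant is the ordered tuple of activities in a trace.
    """
    variants = sorted({tuple(trace) for trace in log})
    rng = random.Random(seed)
    rng.shuffle(variants)
    n_test = max(1, round(test_fraction * len(variants)))
    test_variants = set(variants[:n_test])
    train_variants = set(variants[n_test:])
    return train_variants, test_variants


def variant_scores(generated, train_variants, test_variants):
    """Set-overlap scores (illustrative definitions, not the paper's metrics).

    fitness:        share of training variants the model reproduces
    generalization: share of held-out variants the model reproduces
    precision:      share of generated variants that are valid at all
    """
    generated = {tuple(v) for v in generated}
    valid = train_variants | test_variants

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "fitness": ratio(len(generated & train_variants), len(train_variants)),
        "generalization": ratio(len(generated & test_variants), len(test_variants)),
        "precision": ratio(len(generated & valid), len(generated)),
    }


# Hypothetical toy log with five distinct variants.
toy_log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "c", "d"],
    ["a", "e", "d"],
    ["a", "b", "e", "d"],
    ["a", "e", "b", "d"],
]

train, test = split_by_variant(toy_log, test_fraction=0.2, seed=0)

# Stand-in for variants sampled from a trained model: it memorises the
# training variants and invents one invalid variant.
generated = list(train) + [("x",)]
scores = variant_scores(generated, train, test)
```

In this sketch, a model that only memorises the training variants scores perfectly on fitness but zero on generalization, which is the kind of behavior that splitting by variant rather than by trace is meant to expose.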