Recent papers have introduced a novel approach to explain why a Predictive Process Monitoring (PPM) model for outcome-oriented predictions makes wrong predictions. They have also shown how to exploit these explanations, obtained with state-of-the-art post-hoc explainers, to identify in a semi-automated way the most common features that induce a predictor to make mistakes and, in turn, to reduce the impact of those features and increase the accuracy of the predictive model. This work starts from the assumption that frequent control-flow patterns in event logs may represent important features that characterize, and therefore explain, a certain prediction. In this paper, we therefore (i) employ a novel encoding that leverages DECLARE constraints in PPM and compare its effectiveness with that of state-of-the-art PPM encodings, in particular for the task of outcome-oriented prediction; (ii) introduce a completely automated pipeline for identifying the most common features that induce a predictor to make mistakes; and (iii) show the effectiveness of the proposed pipeline in increasing the accuracy of the predictive model by validating it on different real-life datasets.
Title: Explain, Adapt and Retrain: How to Improve the Accuracy of a PPM Classifier through Different Explanation Styles