Predictive process analytics often applies machine learning to predict the future states of a running business process. However, the internal mechanisms of many existing predictive algorithms are opaque, so a human decision-maker cannot understand \emph{why} a certain activity was predicted. Recently, counterfactuals have been proposed in the literature to derive human-understandable explanations from predictive models. Current counterfactual approaches search for the minimum change in feature values that flips the outcome of a prediction. Although many algorithms have been proposed, their application to sequential, multi-dimensional data such as event logs has not been explored in the literature. In this paper, we explore the use of a recent, popular model-agnostic counterfactual algorithm, DiCE, in the context of predictive process analytics. The analysis reveals that the algorithm is limited when applied to derive explanations of process predictions, due to (1) process domain knowledge not being taken into account, (2) long traces that tend to be less understandable, and (3) difficulties in optimising the counterfactual search with categorical variables. We design an extension of DiCE that can generate counterfactuals for process predictions, and propose an approach that supports deriving milestone-aware counterfactuals at different stages of a trace to promote interpretability. We apply our approach to the BPIC2012 event log, and the analysis results demonstrate the effectiveness of the proposed approach.
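To make the notion of a counterfactual as a minimum feature change concrete, the following is a minimal sketch in Python. It is not DiCE and not the extension proposed in this paper: it is a brute-force search over a toy instance with three categorical features. The model `predict`, the feature names, and the value domains in `DOMAINS` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def predict(instance):
    """Hypothetical stand-in for an opaque predictive model (assumption)."""
    score = 0
    score += 2 if instance["income"] == "high" else 0
    score += 1 if instance["has_guarantor"] else 0
    score += 1 if instance["amount"] == "small" else 0
    return "approved" if score >= 3 else "declined"


# Hypothetical categorical domains for each feature (assumption).
DOMAINS = {
    "income": ["low", "high"],
    "has_guarantor": [False, True],
    "amount": ["small", "large"],
}


def minimal_counterfactuals(instance, desired):
    """Return all candidates that reach `desired` with the fewest changed features.

    Subsets of features are tried in order of increasing size, so the first
    size that yields any hit is the minimum number of changes.
    """
    features = list(DOMAINS)
    for k in range(1, len(features) + 1):
        found = []
        for subset in combinations(features, k):
            # For each chosen feature, consider every value except the current one.
            alternatives = [
                [v for v in DOMAINS[f] if v != instance[f]] for f in subset
            ]
            for values in product(*alternatives):
                candidate = dict(instance)
                candidate.update(zip(subset, values))
                if predict(candidate) == desired:
                    found.append(candidate)
        if found:
            return found
    return []


query = {"income": "low", "has_guarantor": True, "amount": "small"}
results = minimal_counterfactuals(query, "approved")
print(predict(query))
print(results)
```

In this toy case a single change (`income` from `low` to `high`) flips the prediction. Exhaustive enumeration grows exponentially with the number of features and values, which gives some intuition for why the search becomes difficult on long traces with many categorical variables.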