Increasing the success rate of a process, i.e. the percentage of cases that end in a positive outcome, is a recurrent process improvement goal. At runtime, there are often certain actions (a.k.a. treatments) that workers may execute to raise the probability that a case ends in a positive outcome. For example, in a loan origination process, a possible treatment is to issue multiple loan offers to increase the probability that the customer takes a loan. Each treatment has a cost. Thus, when defining policies for prescribing treatments to cases, managers need to consider the net gain of the treatments. Also, the effect of a treatment varies over time: treating a case earlier may be more effective than treating it later. This paper presents a prescriptive monitoring method that automates this decision-making task. The method combines causal inference and reinforcement learning to learn treatment policies that maximize the net gain. It leverages a conformal prediction technique to speed up the convergence of the reinforcement learning mechanism by separating cases that are likely to end in a positive or negative outcome from uncertain cases. An evaluation on two real-life datasets shows that the proposed method outperforms a state-of-the-art baseline.
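To make the two ingredients named above concrete, the following is a minimal sketch and not the paper's implementation. It assumes that the net gain of a treatment is the estimated uplift in success probability times the value of a positive outcome, minus the treatment cost, and it uses a generic split conformal procedure to separate confidently predicted cases from uncertain ones. All names (`net_gain`, `calibrate_threshold`, `triage`, `benefit`, `cost`, `alpha`) are hypothetical.

```python
import math


def net_gain(p_treated: float, p_untreated: float,
             benefit: float, cost: float) -> float:
    """Expected net gain of treating a case (illustrative assumption):
    uplift in success probability times outcome value, minus cost."""
    uplift = p_treated - p_untreated
    return uplift * benefit - cost


def calibrate_threshold(cal_probs, cal_labels, alpha: float) -> float:
    """Split conformal calibration for a binary outcome.

    cal_probs[i] is the predicted probability of a positive outcome for
    calibration case i, cal_labels[i] is its true outcome (1 or 0).
    The nonconformity score is 1 minus the probability of the true label.
    """
    scores = sorted(
        1.0 - (p if y == 1 else 1.0 - p)
        for p, y in zip(cal_probs, cal_labels)
    )
    n = len(scores)
    # Rank of the conformal quantile with finite-sample correction.
    k = math.ceil((n + 1) * (1.0 - alpha))
    return scores[k - 1] if k <= n else float("inf")


def triage(p_positive: float, q_hat: float) -> str:
    """Build the conformal prediction set and classify the case."""
    pred_set = set()
    if 1.0 - p_positive <= q_hat:
        pred_set.add(1)
    if 1.0 - (1.0 - p_positive) <= q_hat:
        pred_set.add(0)
    if pred_set == {1}:
        return "likely_positive"
    if pred_set == {0}:
        return "likely_negative"
    # Empty or two-label sets both signal that the model is unsure.
    return "uncertain"
```

Under these assumptions, a policy would only need to deliberate over cases labelled `uncertain`, and a treatment is worth considering when `net_gain(...)` is positive. How the paper's reinforcement learning agent actually uses this separation is not specified in the abstract.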