Traditional process mining techniques take event data as input where each event is associated with exactly one object. An object represents the instantiation of a process. Object-centric event data contain events associated with multiple objects expressing the interaction of multiple processes. As traditional process mining techniques assume events associated with exactly one object, these techniques cannot be applied to object-centric event data. To use traditional process mining techniques, the object-centric event data are flattened by removing all object references but one. The flattening process is lossy, leading to inaccurate features extracted from flattened data. Furthermore, the graph-like structure of object-centric event data is lost when flattening. In this paper, we introduce a general framework for extracting and encoding features from object-centric event data. We calculate features natively on the object-centric event data, leading to accurate measures. Furthermore, we provide three encodings for these features: tabular, sequential, and graph-based. While tabular and sequential encodings have been heavily used in process mining, the graph-based encoding is a new technique preserving the structure of the object-centric event data. We provide six use cases: a visualization and a prediction use case for each of the three encodings. We use explainable AI in the prediction use cases to show the utility of both the object-centric features and the structure of the sequential and graph-based encoding for a predictive model.
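To make the loss caused by flattening concrete, the following minimal sketch builds a toy object-centric event log and flattens it on one object type at a time. The log layout, the `flatten` and `objects_per_event` helpers, and all identifiers are hypothetical illustrations, not the framework introduced in the paper.

```python
from collections import defaultdict

# Toy object-centric event log; every identifier here is hypothetical.
# Each event may reference several objects of several types.
events = [
    {"id": "e1", "activity": "place order",
     "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
    {"id": "e2", "activity": "pick item", "objects": {"item": ["i1"]}},
    {"id": "e3", "activity": "pick item", "objects": {"item": ["i2"]}},
    {"id": "e4", "activity": "ship order",
     "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
]


def flatten(log, object_type):
    """Drop all object references except those of one type.

    Returns one activity sequence (case) per object of that type.
    """
    cases = defaultdict(list)
    for event in log:
        for obj in event["objects"].get(object_type, []):
            cases[obj].append(event["activity"])
    return dict(cases)


def objects_per_event(log):
    """A feature computed natively: how many objects each event touches."""
    return {e["id"]: sum(len(ids) for ids in e["objects"].values())
            for e in log}


by_order = flatten(events, "order")  # the "pick item" events disappear
by_item = flatten(events, "item")    # order-level events are duplicated
native = objects_per_event(events)   # computed on the unflattened log

print(by_order)
print(by_item)
print(native)
```

Flattening on `order` discards both `pick item` events, while flattening on `item` replicates `place order` and `ship order` once per item, so the four original events become six. A feature such as the number of objects per event can only be read from the unflattened log.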