Efficient Checking of Timed Order Compliance Rules over Graph-encoded Event Logs

Validating compliance rules against process data is a fundamental capability in business process management. Over the years, the problem has been addressed for different types of process data, i.e., process models, process event data at runtime, and event logs representing historical executions. Several approaches have been proposed for compliance checking over process logs, based on different data models and storage technologies, including relational databases, graph databases, and proprietary formats. Graph-based encoding of event logs is a promising direction that turns several process analytics tasks, compliance checking among them, into queries on the underlying graph. In this paper, we argue that encoding log data as graphs is not, by itself, enough to guarantee efficient processing of queries on this data. Efficiency matters because compliance checking is interactive in nature, so it would benefit from query times that grow sub-linearly with the size of the data. Moreover, as more data are added, e.g., when new batches of logs arrive, the size of the stored graph should grow sub-linearly, to save both storage space and querying time. We propose two graph-based encoding methods, realized in Neo4j, and show the benefits of these encodings on a special class of queries, namely timed order compliance rules. Compared to a baseline encoding, our experiments show up to a 5x speedup in querying time and a 3x reduction in graph size.
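To make the notion of a timed order compliance rule concrete, the following is a minimal sketch in plain Python over an in-memory list of events. It does not reproduce the graph encodings proposed in the paper or any Neo4j query. It assumes one common reading of such a rule: every occurrence of one activity must be followed, within the same case, by another activity inside a time bound. The function name `violating_cases`, the tuple layout, and the sample activities are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def violating_cases(events, first, second, max_gap):
    """Return the sorted ids of cases that violate an assumed rule:
    every `first` event must be followed by a `second` event in the
    same case, strictly later and at most `max_gap` afterwards.

    `events` is an iterable of (case_id, activity, timestamp) tuples
    (a hypothetical layout, not the paper's data model).
    """
    # Group events by case so each trace is checked on its own.
    by_case = defaultdict(list)
    for case_id, activity, ts in events:
        by_case[case_id].append((ts, activity))

    violations = set()
    for case_id, trace in by_case.items():
        for ts, activity in trace:
            if activity != first:
                continue
            # Look for a matching `second` event inside the time window.
            satisfied = any(
                other == second and ts < other_ts <= ts + max_gap
                for other_ts, other in trace
            )
            if not satisfied:
                violations.add(case_id)
                break
    return sorted(violations)


# Illustrative data only; not taken from the paper's experiments.
EVENTS = [
    ("c1", "Receive order", datetime(2024, 1, 1, 9, 0)),
    ("c1", "Ship goods", datetime(2024, 1, 1, 15, 0)),
    ("c2", "Receive order", datetime(2024, 1, 1, 10, 0)),
    ("c2", "Ship goods", datetime(2024, 1, 3, 10, 0)),
    ("c3", "Receive order", datetime(2024, 1, 2, 8, 0)),
]

if __name__ == "__main__":
    print(violating_cases(EVENTS, "Receive order", "Ship goods",
                          timedelta(hours=24)))
```

This sketch visits every event of every case, so its cost grows at least linearly with the log. That is the kind of full scan the abstract argues an interactive compliance checker should avoid.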
