In modern enterprises, Business Processes (BPs) are realized over a mix of workflows, IT systems, Web services and direct collaborations of people. Accordingly, process data (i.e., BP execution data such as logs containing events, interaction messages and other process artifacts) is scattered across several systems and data sources, and increasingly shows all the typical properties of Big Data. Understanding process execution data is challenging because key business insights remain hidden in the interactions among process entities: most objects are interconnected, forming complex, heterogeneous but often semi-structured networks. In the context of business processes, we view the Big Data problem as a massive number of interconnected data islands drawn from personal, shared and business data. We present a framework that models process data as graphs, i.e., Process Graphs, together with abstractions to summarize a process graph and to discover concept hierarchies for entities based on both data objects and their interactions in process graphs. We present a language, namely BP-SPARQL, for the explorative querying and understanding of process graphs from various user perspectives. We have implemented a scalable architecture for querying, exploration and analysis of process graphs. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.
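To make the idea of a process graph and its summary concrete, the following is a minimal sketch in plain Python. It is not the framework's implementation and does not use BP-SPARQL; the entity identifiers, type names (`Person`, `Event`, `Artifact`), edge labels and the `summarize` helper are all hypothetical and chosen only for illustration. It assumes a process graph can be approximated by typed nodes and labeled edges, and that one simple form of summary is to collapse entities into their types and count the interactions between them.

```python
from collections import Counter

# Hypothetical process entities: id -> entity type (illustrative names only).
nodes = {
    "order-17": "Artifact",
    "invoice-3": "Artifact",
    "alice": "Person",
    "bob": "Person",
    "e1": "Event",
    "e2": "Event",
}

# Hypothetical interactions among entities: (source, label, target).
edges = [
    ("alice", "performed", "e1"),
    ("bob", "performed", "e2"),
    ("e1", "created", "order-17"),
    ("e2", "updated", "order-17"),
    ("e2", "created", "invoice-3"),
]


def summarize(nodes, edges):
    """Collapse entities into their types and count interactions between types."""
    summary = Counter()
    for source, label, target in edges:
        summary[(nodes[source], label, nodes[target])] += 1
    return dict(summary)


summary = summarize(nodes, edges)
for (source_type, label, target_type), count in sorted(summary.items()):
    print(f"{source_type} -[{label}]-> {target_type}: {count}")
```

Grouping by a single type attribute is the simplest possible abstraction; a concept hierarchy, as described above, would presumably allow the same collapse at several levels of granularity.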