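Project title: Research on Key Technologies for Obtaining Desirable Lines from Process Big Data
Project number: No.61472207
Project type: General Program
Approval year: 2015
Discipline: Other
Principal investigator: Wen Lijie (闻立杰)
Affiliation: Tsinghua University
Funding: 800,000 yuan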
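Abstract (translated from Chinese): This project aims to develop new process mining techniques that can handle massive event logs recording processes executed in highly variable and heterogeneous contexts. The goal of process mining is to extract process-related information from event logs, for example by automatically discovering a process model. Despite many recent advances in process mining, important challenges remain to be solved. Discovering process models from event logs is known to be difficult, and the large-scale application of process mining requires major breakthroughs. The project pursues these breakthroughs through three research themes. First, it will develop techniques that decompose process mining problems (such as process discovery and conformance checking) into smaller problems that can be solved more efficiently and distributed across a computer cluster. Second, to support applications in which not all events can be stored over a very long period, it will develop on-the-fly process mining techniques that can learn (or check) process models without storing excessive amounts of events. Third, because current process mining techniques require the analyst to restrict the scope to a single process model describing the behavior of a homogeneous group of cases in steady state, it will develop comparative process mining techniques that systematically highlight commonalities and differences, so that heterogeneous processes that change over time and have many variants can be handled.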
Keywords (translated from Chinese): Process Mining; Big Data; Distributed Algorithms; Desirable Line; On-the-fly Mining
English abstract: The research project aims at developing new process mining techniques that are able to deal with huge event logs recorded for processes executed in possibly highly variable and heterogeneous contexts. The goal of process mining is to extract process-related information from event logs, e.g., to automatically discover a process model. Despite recent advances in process mining, there are important challenges that need to be addressed. In fact, the discovery of process models from event logs is notoriously difficult, and major breakthroughs are needed for the large-scale application of process mining. This project is composed of three research tracks aiming at such breakthroughs: (a) In Track T1 we will develop techniques to decompose process mining problems (e.g., process discovery and conformance checking) into smaller problems that can be solved more efficiently and that can be distributed over a network of computers. (b) Track T2 goes one step further. To support applications where it is impossible to store events over an extended period, on-the-fly process mining techniques will be developed that can learn (or check) process models without storing excessive amounts of events. (c) Existing techniques require the analyst to restrict the scope to a single process model describing the behavior of a homogeneous group of cases in steady state. In Track T3 we will develop comparative process mining techniques that systematically highlight commonalities and differences. This way we can deal with heterogeneous processes that are changing over time and that have many variants.
English keywords: Process Mining; Big Data; Distributed Algorithms; Desirable Line; On-the-fly Mining