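Project title: Software Engineering Linked Data Mining for Crowd Collaborative Development
Project number: No.61472242
Project type: General Program
Year of approval: 2015
Project discipline: Automation technology, computer technology
Project author: Shen Beijun (沈备军)
Author affiliation: Shanghai Jiao Tong University
Project funding: CNY 760,000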
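Chinese abstract (translated): Crowd software engineering is becoming a new mode of software development in the cloud era. It draws on crowd development effort and crowd intelligence techniques to rapidly build software that is large in scale, complex in function, and technologically innovative. However, crowd collaborative development faces the challenge of large-scale data: hundreds of thousands of developers, tens of millions of lines of code, and large volumes of requirements, design models, test cases, defects, changes, tasks, discussion records, emails, and so on. Efficiently obtaining information and discovering knowledge from such distributed, heterogeneous, large-scale data has become a difficult problem. This project introduces the Semantic Web into software engineering, links these multi-source heterogeneous data semantically at fine granularity, and studies new methods and techniques for linked-data-driven software engineering data mining. It focuses on: ① building an ontology-based software engineering linked data model and proposing ontology annotation and RDF generation methods, so that linked data can be constructed automatically; ② building a software engineering linked data mining framework that supports storage and querying of large-scale linked data; ③ on this basis, exploring techniques and algorithms for software product quality prediction based on data link features, related-change recommendation based on association analysis, and discovery of crowd development collaboration patterns based on data link graph mining. The overall aim is software intelligence that covers the full life cycle of crowd software development.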
Chinese keywords (translated): crowd software engineering; data mining; linked data; ontology annotation
English abstract: Crowd software engineering is becoming a new software development mode in the cloud era. It rapidly constructs large-scale software with complex functions and technological innovation by using crowd effort and swarm intelligence. However, it faces challenges from software engineering big data: hundreds of thousands of developers, tens of millions of lines of code, and huge amounts of requirements, design models, test cases, defects, changes, plans and tasks, discussion records, email messages, and so on. Information awareness and knowledge discovery from these distributed, heterogeneous, and massive data have therefore become a difficult problem. The project will introduce the Semantic Web into software engineering, interlink and integrate these software artefacts, and explore linked-data-driven semantic query and mining methods and technologies. The project focuses on (1) establishing an ontology-based software engineering linked data model, and proposing ontology annotation and RDF data generation methods for software engineering linked data; (2) constructing a linked-data-driven unified framework for semantic query and mining of software engineering data, and realizing efficient queries on massive linked data; (3) exploring novel technologies and algorithms for software product quality prediction using data link features, pertinent artifact recommendation using association analysis, and crowd development collaboration pattern discovery using data link graph mining. In this way, software intelligence will be realized to support the whole crowd software development life cycle.
English keywords: Crowd Software Engineering; Data Mining; Linked Data; Ontology Annotation