Theory and Methods of Distributed Data Mining Based on Semantic Distance

Project title: Theory and Methods of Distributed Data Mining Based on Semantic Distance

Project number: No.71271076

Project type: General Program

Year of approval: 2013

Discipline: Management Science

Principal investigator: 刘滨 (Liu Bin)

Affiliation: 河北科技大学 (Hebei University of Science and Technology)

Funding: 550,000 yuan

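Abstract (translated from Chinese): Taking personalized recommendation and sales forecasting in e-commerce as the engineering background and target application, this project addresses the construction and solution of distributed data mining (DDM) models. It starts from eliminating the quality risks of mining semantically partitioned data sources independently: it combines multiple ontology matching strategies to build a composite quantification system that measures the semantic distance between data sources from the local to the global level, extracts the essential semantic differences between data sources, and uses them to build a hierarchical mining system based on grouping data sources. It then studies methods for assessing the quality of the hierarchically screened results, a knowledge integration model, and a load-balancing mechanism that treats the layer as the resource unit. From a structural perspective, it next constructs a workable hierarchical DDM model that emphasizes quality while taking efficiency into account. To solve the model, it designs an intelligent computing method that integrates multiple algorithms (neural networks, genetic algorithms, etc.), builds a Web service library and an Agent-led service composition model, designs a multi-Agent working mechanism based on the JAFMAS framework, and establishes a human-computer interaction mechanism that strengthens semantic understanding of the mining process and results and raises user participation. Finally, the effectiveness of the model and algorithms is verified on concrete cases. The research will enrich and improve DDM theory and methods, and has broad application prospects in fields such as e-commerce personalized recommendation and sales forecasting.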

Keywords (translated from Chinese): semantic distance; ontology; data mining; data visualization

English abstract: This project takes personalized recommendation and sales forecasting in e-commerce as its engineering background and target application. Focusing on the construction and solution of distributed data mining (DDM) models, the research starts from the goal of eliminating the quality risks introduced by mining semantically partitioned data sources independently. We will use multi-strategy ontology matching to build a compound quantification architecture for measuring the semantic distance between data source ontologies, from the local level to the whole. With this architecture, the essential semantic differences between data sources can be identified, and a hierarchical data mining architecture will then be set up. Secondly, we will develop a quality inspection method for hierarchically filtering the intermediate results, a knowledge integration model, and load-balancing mechanisms that use the layer as the resource unit. Then, from a structural point of view, a workable hierarchical DDM model will be proposed that focuses on quality while also considering efficiency. To solve the DDM model, an intelligent computing method that integrates multiple algorithms (neural networks, genetic algorithms, etc.) will be designed; the web service library and agent-oriented service composition model will be built sequentia

English keywords: semantic distance; ontology; data mining; data visualization
