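Research on Key Technologies for Chinese Entity Knowledge Mining in the Internet Environment

Project name: Research on Key Technologies for Chinese Entity Knowledge Mining in the Internet Environment

Project number: No.61202329

Project type: Young Scientists Fund project

Year of approval: 2013

Discipline: Computer Science

Project author: 刘康 (Liu Kang)

Affiliation: Institute of Automation, Chinese Academy of Sciences

Project funding: CNY 230,000 (23万元)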

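Abstract (translated from Chinese): Mining knowledge such as entities, entity categories, and entity relations from complex and changing Web data, organizing it, and establishing semantic links among knowledge items provides important support for text understanding, information retrieval, and question answering systems. Addressing the characteristics of Internet data ("massive and uncertain", "multi-source and heterogeneous", "dynamically changing", and "noisy"), this proposal studies Chinese entity knowledge mining in the Internet environment. The specific research content includes: (1) to meet the knowledge representation requirement of "diverse relations, computability, and probabilistic description", studying entity knowledge representation based on a multi-layer semantic graph and methods for automatically constructing its knowledge system; (2) making full use of the differences, complementarity, and correlations among Web information, studying collaborative mining and verification methods for Chinese entity knowledge based on Web information association; (3) studying large-scale probabilistic logical reasoning methods and exploring, from the perspective of knowledge reasoning, how to acquire new knowledge from the Web; (4) building an experimental entity knowledge base, and validating and testing the above key technologies on the research group's existing encyclopedia knowledge question answering platform. The results of this project will provide a reference for natural language understanding and deep computation over Internet information.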

Keywords (translated from Chinese): open-domain information extraction; entity; entity relation

English abstract: Mining entity knowledge (entities, their categories, and their relationships) has a significant impact on many applications, such as text content understanding, information retrieval, and question answering systems. This proposal studies techniques for mining Chinese entity knowledge from massive, uncertain, multi-source, heterogeneous, dynamic, and noisy Web data. The main tasks are: (1) To meet the demand for diverse relations and probabilistic description in knowledge representation, we study knowledge representation based on a multi-layer semantic graph and a method for automatically constructing the knowledge framework. (2) Making full use of the differences, complementarity, and correlations among Web information, we study collaborative methods for mining and verifying entity knowledge from the Web. (3) We study new knowledge acquisition from the viewpoint of large-scale probabilistic logic reasoning. (4) We construct an experimental entity knowledge base and test the above key techniques on the existing Chinese Encyclopedia QA platform. The results of this project will provide a useful reference for natural language understanding and deep Web information computation.

English keywords: Open Information Extraction; Entity; Entity Relation
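
The abstract's "probabilistic description" of entity knowledge and its use of complementary Web sources can be made concrete with a small illustration. The sketch below is not the project's method: it assumes knowledge is stored as (subject, relation, object) triples with a confidence per extraction, and it swaps in a plain noisy-OR rule (which treats sources as independent) to combine evidence for the same triple. The function names, triples, and confidence values are all hypothetical.

```python
from collections import defaultdict


def noisy_or(confidences):
    """Combine per-source confidences as 1 - prod(1 - p_i).

    Assumes the sources are independent, which real Web sources often are not.
    """
    remaining = 1.0
    for p in confidences:
        remaining *= (1.0 - p)
    return 1.0 - remaining


def merge_triples(extractions):
    """Group (subject, relation, object, confidence) tuples by triple
    and combine the confidences of each group with noisy-OR."""
    grouped = defaultdict(list)
    for subj, rel, obj, conf in extractions:
        grouped[(subj, rel, obj)].append(conf)
    return {triple: noisy_or(confs) for triple, confs in grouped.items()}


# Hypothetical extractions; the confidence values are made up for illustration.
extractions = [
    ("Beijing", "capital_of", "China", 0.5),   # from source A
    ("Beijing", "capital_of", "China", 0.5),   # from source B
    ("Beijing", "instance_of", "city", 0.8),   # from source A only
]
merged = merge_triples(extractions)

for triple, confidence in merged.items():
    print(triple, round(confidence, 3))
```

A triple seen in two sources ends up with higher confidence than either source gives alone, which is one simple reading of how complementary sources could support verification.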

