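Project title: Research on Methods for Chinese-Thai Cross-Language News Event Retrieval
Project number: No.61462054
Project type: Regional Science Fund Project
Approval year: 2015
Discipline: Computer Science
Project author: Wang Hongbin (王红斌)
Affiliation: Kunming University of Science and Technology (昆明理工大学)
Funding: 450,000 CNY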
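Chinese abstract: Political, cultural, and economic exchanges between Thailand and China are becoming increasingly frequent. Being able to quickly query how news events involving China and Thailand develop and change helps in understanding Thailand's policies and attitudes in areas such as politics, diplomacy, the economy, and the military. This project focuses on three problems: extracting the feature elements of Chinese-Thai news events and the associations among them, storing news events, and computing similarity in cross-language news event retrieval. For extraction, it studies a document-level method for extracting news-element entity relations based on multi-feature fusion, and builds news event chains made up of entities, verbs, and the associations among them. For storage, it takes the structural characteristics of news event chains into account and studies a graph storage model and indexing strategy for Chinese-Thai news events. For cross-language similarity computation, it studies a Chinese-Thai cross-language news event similarity method that fuses news event features, building on feature representations derived from bilingual-dictionary translation information, event entities, entity relations, and event structure. The project will address information extraction, storage, and similarity computation in Chinese-Thai cross-language news event retrieval, and has important theoretical and practical value.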
Chinese keywords: Chinese-Thai; news events; information retrieval; similarity computation; graph storage structure
English abstract: As political, cultural, and economic exchanges between Thailand and China become more frequent, retrieving news events involving the two countries is significantly helpful for understanding Thailand's policies and attitudes in politics, diplomacy, the economy, and military affairs. This project focuses on extracting the feature elements of news events in Chinese and Thai and the associations among them. It also covers storage strategies for news events and similarity computation in cross-language news event retrieval. For the extraction of feature elements and associations, we will study a document-level method for extracting news elements and entity relations based on multi-feature fusion, and construct news event chains of entities, verbs, and the associations among them. For news event storage, taking the structural characteristics of news event chains into account, we will construct a graph storage model for Chinese-Thai news events and study indexing strategies. For similarity computation in cross-language news event retrieval, we will study a method that fuses news event features, using feature representations based on information such as bilingual dictionary translations, event entities, entity relations, and event structure. The goal of this project is to address information extraction, storage, and similarity computation in Chinese-Thai cross-language news event retrieval, which is valuable for both theoretical research and practical application.
English keywords: Chinese-Thai; News Events; Information Retrieval; Similarity Computation; Graph Storage Structure