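Project title: Research on Topic Analysis Methods Integrating Text Content and Structural Information
Project number: No.61472088
Project type: General Program
Year of approval: 2015
Discipline: Computer Science
Project leader: 黄萱菁 (Huang Xuanjing)
Affiliation: 复旦大学 (Fudan University)
Funding: CNY 830,000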
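Chinese abstract (translated): In recent years, social media has developed rapidly in China. The information published and spread on it supplies the hot topics that people discuss in daily life and exerts broad influence on public opinion. Traditional topic analysis research mainly takes news reports as its object, so it cannot fully incorporate the key characteristics of social media, such as information content, social networks, and user behavior. This project therefore has significant academic and practical value. Targeting social media, we plan to study topic analysis methods that integrate text content and structure, covering topic representation and modeling, topic detection and tracking, and topic structure and semantic analysis. Specifically, we will: 1) build a topic representation model that integrates structure and semantics, taking the key characteristics of social media into account; 2) study topic detection and tracking algorithms based on nonparametric Bayesian methods, methods for mining associations between social media and news media, and algorithms for analyzing and predicting topic propagation; 3) building on the topic representation model, study topic structure and semantic frame analysis algorithms based on structured machine learning, as well as topic keyword extraction algorithms based on topic models. Through this project, we plan to publish more than 15 papers in international conferences or journals recommended by the CCF.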
Chinese keywords (translated): natural language processing; semantic analysis; topic detection and tracking
English abstract: In recent years, social media has grown and flourished in China. It publishes and disseminates many types of information and supplies the hot topics that people discuss in their daily lives. It has a wide influence on public opinion and a significant impact on traditional news media and on society. Traditional topic analysis research, however, mainly targets news reports and therefore cannot fully integrate the major characteristics of social media, including information content, social networks, and user behavior. Research on topic analysis that integrates text content and structural information is therefore of both academic value and practical significance, and can contribute to the maintenance of social stability and national information security. This project focuses on topic analysis for social media. We will carry out research on topic representation and modeling, topic detection and tracking, and topic structure and semantic analysis. We will study: 1) topic models that integrate structural and semantic information and take into account information content, social networks, and user behavior; 2) topic detection and tracking methods based on nonparametric Bayesian approaches, approaches for mining associations between social media and traditional media, and algorithms for analyzing and predicting topic propagation; and 3) topic structure and semantic analysis methods based on structured machine learning, and topic-oriented keyword extraction algorithms based on topic models. We intend to publish more than 15 papers in international conferences or journals recommended by the China Computer Federation (CCF) during the research.
English keywords: Natural Language Processing; Semantic Analysis; Topic Detection and Tracking