Project title: Research on Key Technologies for Social Media User Classification Based on Heterogeneous Feature Fusion
Project number: No.61300209
Project type: Young Scientists Fund project
Year of initiation/approval: 2014
Project discipline: Automation technology, computer technology
Project author: Zhang Haijun (张海军)
Author affiliation: Harbin Institute of Technology
Project funding: 250,000 yuan (CNY)
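Chinese abstract: With the widespread adoption of social media, classifying social media users has become a research problem of broad interest across many application areas. It poses three major challenges that distinguish it from the classification of single objects such as text or images: heterogeneous features, complex correlations among multiple features, and the multi-label problem. The key scientific question of this project is how to design a sound feature-space representation model for social media users and to develop effective heterogeneous feature fusion methods, so that classification algorithms for social media users can be optimized in a multi-label setting. The project first proposes a multi-relational hierarchical graph representation model for social media users. On this basis, it focuses on heterogeneous feature fusion and multi-label classification techniques for social media users, specifically: (1) a content-based subgraph matching algorithm for heterogeneous features, which addresses the fusion of text, images, user information, and comments; (2) a dynamic matching algorithm based on temporal feature fusion, which incorporates time information and addresses the measurement of user similarity in a time-varying environment; (3) a multi-label classification algorithm for social media users based on social relationship fusion, which incorporates users' social relationship information while also solving the multi-label classification problem.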
Chinese keywords: Social media user; Heterogeneous features; Multi-label; Classification;
English abstract: With the rapid development of social media, the last few years have witnessed growing interest from different fields in classifying social media users. Unlike text or image classification, the classification of social media users raises three challenges: (1) users have heterogeneous features; (2) multiple features are usually correlated; and (3) users may have multiple labels. The essential issues of this project are how to design an appropriate model to represent the feature space of users, how to integrate heterogeneous features effectively, and how to optimize the classification algorithms in a multi-label learning environment. In this project, on the basis of a hierarchical framework that represents the multi-relational properties of social media users, we aim to study fusion techniques for heterogeneous features and multi-label classification techniques for social media users. The research is threefold: (1) a content-based matching algorithm for heterogeneous feature graphs, which addresses the fusion of text, image, and user information, etc.; (2) a dynamic matching algorithm, which integrates the reported time information of social media so that the similarity between users can be evaluated in a dynamic environment; (3) a social relatio
English keywords: Social media user; Heterogeneous feature; Multi-label; Classification;