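Project title: Research on Large-Scale Tensor Classification Algorithms with Class Noise
Project number: No.61273295
Project type: General Program
Approval year: 2013
Discipline: Automation technology, computer technology
Principal investigator: Yang Xiaowei (杨晓伟)
Affiliation: South China University of Technology (华南理工大学)
Funding: 580,000 CNY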
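Abstract (translated from Chinese): Large-scale classification problems with class noise arise widely in web page classification, text classification, content-based video retrieval, offline handwritten character recognition, and biological information processing. They are an important research topic in data mining and machine learning and have attracted broad attention from researchers in computer science, automation, telecommunications, and mathematics, both in China and abroad. In this project, to solve large-scale tensor classification problems with class noise, we plan to carry out four lines of research within the theoretical framework of support vector machines: (1) for binary classification, build support tensor machine models; (2) for small and medium-sized binary classification problems with class noise, build robust support tensor machine models; (3) for multi-class classification, design a reduced one-against-all support tensor machine algorithm; (4) for large-scale classification problems with class noise, design support tensor machine algorithms based on CF-tree clustering and local learning, and analyze the error bound of local learning. This research will not only establish the relevant support tensor machine models, produce more broadly applicable algorithms, and solve practical large-scale classification problems with class noise, but will also enrich the research content of data mining and machine learning and promote the development of machine learning and mathematical theory.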
Keywords (translated from Chinese): Tensor; Large scale; Class noise; Classification; Support vector machine
English abstract: Large-scale classification problems with class noise arise widely in web page categorization, text categorization, content-based video retrieval, offline handwritten character recognition, and biological information processing. Designing effective algorithms for these problems is an important issue in data mining and machine learning, and it has attracted growing interest from researchers around the world in computer science, automation, telecommunications, and mathematics. In this proposal, under the theoretical framework of support vector machines, we will pursue the following four lines of research to deal with large-scale tensor classification problems with class noise: (1) build support tensor machine models for binary classification problems; (2) build robust support tensor machine models for small and medium-sized binary classification problems with class noise; (3) design a reduced one-against-all support tensor machine algorithm for multi-class classification; (4) design support tensor machine algorithms based on CF-tree clustering and local learning for large-scale classification problems with class noise, and analyze the error bound of local learning. What is of significance in this proposal will be not only building some sup
English keywords: Tensor; Large Scale; Class Noise; Classification; Support Vector Machine