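Project title: Theoretical Research on Machine Learning in Complex Environments
Project number: No.61503179
Project type: Young Scientists Fund project
Year of approval: 2016
Discipline: Other
Principal investigator: Wei Gao (高尉)
Institution: Nanjing University
Funding: 220,000 yuan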
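Abstract (translated from Chinese): Research on learning theory provides important support and guidance for the development of machine learning. Classical learning theory typically assumes that data are independent and identically distributed, that each example carries a single label, that the data are highly reliable, and that accuracy is the criterion for measuring learning performance. As machine learning extends to more application areas, learning environments are becoming increasingly complex: data distributions change over time and space, examples carry many interrelated labels, data contain substantial noise, and multiple criteria measure learning performance from different perspectives. This project focuses on the theory of machine learning in such complex environments. It aims to provide a generalization analysis of learning methods under distribution change; a generalization analysis of learning methods based on label relationships; a theoretical analysis of data noise, on the basis of which noise-tolerant learning methods will be proposed; and a consistency analysis with respect to multiple performance criteria, on the basis of which consistent learning methods will be proposed. The project is expected to produce 4-6 high-quality papers and 1-2 patent applications, and to train 2-4 graduate students.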
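Keywords (translated from Chinese): machine learning; learning theory; generalization analysis; consistency analysis; sample complexity analysis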
English abstract: Learning theory plays an important supporting and guiding role in the development of machine learning. Conventional learning theory typically considers the case where data are drawn i.i.d. from a distribution, each example has a single label, the data are highly reliable, and accuracy is used to measure the performance of classifiers. As machine learning reaches a wider range of real applications, the learning environment inevitably becomes more complex: the data distribution varies over time and space; each instance often has many correlated labels; the data contain a great deal of noise; and different criteria are used to measure the performance of classifiers from different perspectives. This project focuses on learning theory under such complex environments. Our goals are to 1) provide a theoretical analysis of the generalization of classifiers learned under changing distributions; 2) provide a theoretical analysis of the generalization of classifiers based on label correlation; 3) provide a theoretical analysis of noisy data, and propose noise-tolerant learning algorithms; 4) provide a theoretical analysis of consistency with respect to multiple criteria, and propose consistent learning algorithms. The project is expected to publish 4-6 high-quality papers in important international journals and conferences and in top domestic journals, apply for 1-2 patents, and supervise 2-4 graduate students.
English keywords: Machine Learning;Learning Theory;Generalization;Consistency;Sample Complexity