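Project title: Learning Theory of Distributed Supervised Learning
Project number: No.61502342
Project type: Young Scientists Fund project
Year of approval: 2016
Discipline: Automation technology, computer technology
Principal investigator: Lin Shaobo (林绍波)
Institution: Wenzhou University (温州大学)
Funding: 200,000 CNY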
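Chinese abstract (translated): In the big data era, machine learning faces two major challenges: how to design machine learning algorithms that are suitable for big data, and how to develop the theory needed to support their application. In response to the first challenge, many researchers have proposed distributed learning methods that process data with a divide-and-conquer strategy. Although a large body of literature has demonstrated the feasibility of this approach from an engineering standpoint, there is so far no complete theory to support its application. This project studies fundamental theoretical questions such as the statistical behavior of distributed learning, the convergence of distributed learning algorithms, and the complexity of the learning process, and aims to establish a complete learning theory for distributed supervised learning. The main content includes: first, proving theoretically the feasibility and advantages of distributed supervised learning; second, establishing a generalization error decomposition framework suited to distributed supervised learning and deriving its generalization error; third, showing from a theoretical standpoint how distributed learning algorithms can be used effectively to solve supervised learning problems.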
Chinese keywords (translated): Statistical learning theory; Generalization ability; Generalization error
English abstract: Machine learning faces two fundamental challenges in the big data era: how to design machine learning algorithms that can be applied to big data, and how to provide a theoretical framework for analyzing those algorithms. Distributed learning applies a “divide-and-conquer” strategy to machine learning problems and has become a widely used learning scheme in the big data era. Compared with the extensive research on applications, the theoretical study of distributed learning algorithms lags far behind. In this project, we focus on a systematic theoretical analysis of distributed supervised learning within the framework of statistical learning theory. To this end, we will first verify the feasibility and advantages of distributed learning. Then, we will develop an error decomposition strategy tailored to distributed supervised learning and derive its generalization error bound. Finally, we will provide theoretical guidance on how to design efficient distributed learning algorithms.
English keywords: Statistical learning theory; Generalization capability; Generalization error