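Project title: Research and Applications of Statistical Learning Theory Based on Non-i.i.d. Samples
Project number: No.61473328
Project type: General Program
Year of approval: 2015
Discipline: Other
Project author: 张超 (Zhang Chao)
Author affiliation: 大连理工大学 (Dalian University of Technology)
Funding: 580,000 CNY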
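Chinese abstract: Most classical results in statistical learning theory rest on the assumption that samples are independent and identically distributed (i.i.d.). This assumption fails in many practical problems, such as channel estimation, time series prediction and functional data analysis. Although many non-i.i.d. learning models have been applied to practical problems, the theoretical analysis of such models remains relatively weak. This project aims to extend the classical results of statistical learning theory to non-i.i.d. learning problems. Given the complexity of non-i.i.d. learning processes, we will study learning processes based on several representative stochastic processes, including near-epoch dependence, L1-mixingale, Markov processes and Gaussian processes. For each learning process, we will derive the deviation inequalities and symmetrization inequalities that apply to it, obtain generalization bounds, and then analyze its consistency and rate of convergence. We will also study the properties of the complexity of function classes composed of time-dependent functionals, as well as the learnability of non-i.i.d. learning models. Building on these theoretical results, we will summarize the common characteristics of non-i.i.d. learning problems and improve existing algorithmic models.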
Chinese keywords: statistical learning theory; non-i.i.d.; consistency; generalization bound; empirical risk minimization
English abstract: The classical results of statistical learning theory are mostly built on the assumption that samples are drawn independently from an identical distribution. However, this assumption is not always valid in practice, for example in channel estimation, time series prediction and functional data analysis. Many learning models have been developed for non-i.i.d. learning problems, but the relevant theoretical analysis is still in its early stages. In this project, we will extend the classical results of statistical learning theory to the scenario of non-i.i.d. samples. Since the non-i.i.d. scenario contains a wide variety of cases, it is impossible to find a unified form that covers all of them. Instead, one feasible scheme is to select some representative processes, e.g., near-epoch dependence, L1-mixingale, Markov processes and Gaussian processes, which cover several useful cases in the non-i.i.d. scenario, and then to study the theoretical properties of the learning process for each representative stochastic process individually. For each learning process, we will obtain the corresponding deviation inequalities, symmetrization inequalities and generalization bounds. We will then analyze the consistency and the rate of convergence of the learning process. Next, we will study the complexity measures of function classes defined over the real domain and the time interval, and analyze the learnability of existing learning models for the non-i.i.d. scenario. Based on the theoretical findings, we will summarize the general characteristics of non-i.i.d. learning processes and then improve existing non-i.i.d. models.
English keywords: statistical learning theory; non-i.i.d.; consistency; generalization bound; empirical risk minimization