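Project title: Variable Selection for Ultra-High-Dimensional Collinear Models with Censored Data
Project number: No.11726616
Project type: Special Fund Project
Year of approval: 2018
Discipline: Mathematical and Physical Sciences and Chemistry
Project author: Dong Ying (董莹)
Author affiliation: Dalian Minzu University (大连民族大学)
Funding amount: CNY 100,000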
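Abstract (translated from Chinese): Dimension reduction for ultra-high-dimensional data is a frontier topic in current statistical research. This project will study variable selection in ultra-high-dimensional statistical models with censored data, especially when the ultra-high-dimensional covariates are highly correlated (that is, collinear). Although some results exist on variable selection for censored data, variable selection for interval-censored data models has received little study. Therefore, to avoid the collinearity problems that ultra-high-dimensional data bring, the project will propose a generalized combined penalty for generalized linear models with interval-censored data, construct a new penalized likelihood function, and develop new variable screening methods. Within the ultra-high-dimensional framework, it will develop new variable screening methods to achieve sufficient dimension reduction, explore suitable algorithms, and apply the theoretical results to real data analysis. This research can enrich the methodology of penalty-based variable selection and provide a theoretical basis for applications in survival data analysis.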
Keywords (translated from Chinese): Censored data; Collinear model; Variable selection; Ultra-high-dimensional model; Oracle property
English abstract: Dimension reduction for high-dimensional data is a frontier topic in current research. This project aims to solve the variable selection problem in ultra-high-dimensional models with censored data, especially for collinear models. Although there is a large literature on censored data, there has been no systematic theoretical investigation of simultaneous variable selection and coefficient estimation in the continuous generalized linear model with current status data. High correlation among variables in high-dimensional data can cause a serious collinearity problem, and resolving this issue is the main focus of this study. We propose a new combined penalty that mixes a nonconcave penalty function with the ridge penalty. Inspired by the Sure Independence Screening (SIS) method, we explore an appropriate algorithm and apply it to real data. This research can enrich the system of penalized variable selection methods and provide a theoretical basis for applications in survival analysis.
英文关键词: Censored data;Collinearity model;Variable selection;Ultra-high dimensional model;Oracle property
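
The English abstract cites Sure Independence Screening (SIS) as the inspiration for the screening step. As a rough illustration only, the sketch below shows the basic SIS idea for an ordinary linear model with fully observed responses: rank the covariates by absolute marginal correlation with the response and keep the top d. It does not handle censoring and is not the project's proposed method. The function name `sis_screen` and the default d = floor(n / log n) are assumptions made for this example.

```python
import math


def sis_screen(X, y, d=None):
    """Illustrative SIS: keep the d covariates with the largest
    absolute marginal (Pearson) correlation with the response.

    X is a list of n rows, each a list of p covariate values.
    y is a list of n fully observed responses (no censoring handled).
    Returns the selected column indices in ascending order.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    p = len(X[0])
    if d is None:
        # Assumed default for this sketch: floor(n / log n), at least 1.
        d = max(1, int(n / math.log(n)))
    d = min(d, p)

    # Center the response once.
    y_mean = sum(y) / n
    y_c = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in y_c))

    scores = []
    for j in range(p):
        col = [row[j] for row in X]
        col_mean = sum(col) / n
        col_c = [v - col_mean for v in col]
        col_norm = math.sqrt(sum(v * v for v in col_c))
        if col_norm == 0.0 or y_norm == 0.0:
            # A constant column (or constant response) carries no signal.
            scores.append(0.0)
        else:
            cross = sum(a * b for a, b in zip(col_c, y_c))
            scores.append(abs(cross) / (col_norm * y_norm))

    # Rank columns by score, largest first, and keep the top d.
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])
```

In practice a screening step like this is followed by a penalized fit on the retained covariates; under strong collinearity, marginal correlations alone can be misleading, which is the difficulty the abstract sets out to address.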