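Project title: Empirical likelihood and variable selection for generalized linear models with missing data
Project number: No.11201276
Project type: Young Scientists Fund project
Approval year: 2013
Discipline: Mathematical and Physical Sciences and Chemistry
Principal investigator: Chen Xia (陈夏)
Affiliation: Shaanxi Normal University (陕西师范大学)
Funding: 220,000 yuan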
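Abstract (translated from the Chinese): Generalized linear models are widely used tools for analyzing different types of data. They are important in applications, particularly in the statistical analysis of biological, medical, economic, and social data. Missing data are a frequent problem in such applications. This project studies empirical likelihood inference and variable selection for generalized linear models with missing data, including: 1. With data missing at random, auxiliary random vectors are constructed by combining methods for handling missing data (the complete-case method, the inverse probability weighting method, and the generalized imputation method); log empirical likelihood ratio statistics for the unknown parameters of the generalized linear model are then proposed, from which empirical likelihood confidence regions for the parameters are obtained. 2. With data missing at random, a variable selection method for generalized linear models is proposed based on penalized imputed estimating equations. The project aims to prove that the proposed variable selection method consistently identifies the true model and to give the convergence rate of the regularized estimator of the regression coefficients. Simulation studies and real-data examples are intended to show that the proposed empirical likelihood inference and variable selection methods have good finite-sample properties. The project aims to study empirical likelihood confidence regions and variable selection for generalized linear models with missing data, laying a sound theoretical foundation for extending their use in practical problems.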
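Keywords (translated from the Chinese): Empirical likelihood; Confidence region; Measurement error; Missing data; Generalized linear models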
English abstract: Generalized linear models are widely used to analyze various types of data. They are of great importance in applications, especially in the statistical analysis of biological, medical, economic, and social data. Missing data, moreover, are frequently encountered in practice. This project studies empirical likelihood inference and variable selection for generalized linear models with missing data, including: 1. For generalized linear models with data missing at random, we consider empirical likelihood inference for the unknown parameters. By constructing auxiliary random vectors based on the complete-case method, the inverse probability weighted method, and the imputed value method, empirical log-likelihood ratio functions of the unknown parameters are proposed, and the results can be used to construct confidence regions for the parameters. 2. For data missing at random, we present a variable selection procedure for generalized linear models based on penalized estimating equations. We will show that the proposed variable selection procedure identifies the true model consistently and will obtain the convergence rate of the regularized estimators. A simulation study and a real-data example will show that the proposed empirical likelihood method and the variable selection procedure perform well. Thi
English keywords: Empirical likelihood; Confidence region; Measurement error; Missing data; Generalized linear models
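
Illustrative sketch (not part of the project record): the first aim in the abstract, an empirical log-likelihood ratio built from inverse-probability-weighted auxiliary vectors, can be sketched as follows. Several choices here are assumptions made only for illustration and are not stated in the record: a logistic link, a response that is missing at random, and selection probabilities `pi` that are already known. The function names `ipw_auxiliary` and `el_log_ratio` are hypothetical. The statistic is the standard empirical likelihood form 2 * sum(log(1 + lam' Z_i)), where `lam` solves sum(Z_i / (1 + lam' Z_i)) = 0.

```python
import numpy as np

def ipw_auxiliary(X, y, delta, pi, beta):
    """Auxiliary vectors Z_i = (delta_i / pi_i) * x_i * (y_i - mu_i).

    Assumptions for this sketch: logistic mean mu_i = 1 / (1 + exp(-x_i' beta)),
    delta_i = 1 when y_i is observed, pi_i = P(delta_i = 1 | x_i) is known.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=float)
    pi = np.asarray(pi, dtype=float)
    mu = 1.0 / (1.0 + np.exp(-(X @ np.asarray(beta, dtype=float))))
    # Missing responses (possibly NaN) contribute a zero residual.
    resid = np.where(delta == 1, y - mu, 0.0)
    return (delta / pi * resid)[:, None] * X

def el_log_ratio(Z, max_iter=100, tol=1e-10):
    """-2 log empirical likelihood ratio for the hypothesis E[Z_i] = 0.

    Maximizes sum(log(1 + lam' Z_i)) over lam by Newton's method. Assumes the
    origin lies inside the convex hull of the rows of Z and that the matrix
    sum(Z_i Z_i') is nonsingular; otherwise the linear solve fails.
    """
    Z = np.asarray(Z, dtype=float)
    lam = np.zeros(Z.shape[1])
    for _ in range(max_iter):
        w = 1.0 + Z @ lam
        Zw = Z / w[:, None]
        grad = Zw.sum(axis=0)          # gradient of the concave objective
        hess = -(Zw.T @ Zw)            # Hessian (negative definite)
        step = -np.linalg.solve(hess, grad)
        # Halve the step until every implied weight stays positive.
        while np.any(1.0 + Z @ (lam + step) <= 0.0):
            step *= 0.5
        lam = lam + step
        if np.linalg.norm(step) < tol:
            break
    return 2.0 * np.sum(np.log1p(Z @ lam))
```

A confidence region for the parameters would then collect the values of `beta` whose statistic `el_log_ratio(ipw_auxiliary(X, y, delta, pi, beta))` falls below a chosen threshold. How that threshold is calibrated depends on how the weights are obtained, which this sketch does not address.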