Project title: Semiparametric Statistical Inference for Complex Multivariate Data
Project number: No.11471272
Project type: General Program
Year of approval: 2015
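Discipline: Mathematical and Physical Sciences and Chemistry
Project leader: Wang Haibin (王海斌)
Affiliation: Xiamen University
Funding: 600,000 yuan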
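Abstract (translated from Chinese): Multivariate response data are abundant in practice. Besides discrete and continuous data, there are various complex data types, such as the ordinal categorical data and ranking data commonly used in psychology, education, biomedicine, and economics. Building on the single-index dimension-reduction technique, this project will develop a series of multivariate semiparametric models and methods to analyze such complex data. The project is innovative in at least two respects: 1) the single-index dimension-reduction technique is applied not only to modeling the conditional mean structure of complex multivariate response data, but also to modeling their covariance structure and conditional covariance structure; 2) the explanatory variables entering the index include not only observable variables but also variables that cannot be observed directly, and for the latter case the identifiability of the model is addressed first. The project will use a free-knot Bayesian method to carry out a comprehensive statistical analysis of the proposed models, including estimation, testing, and model selection. Acceleration techniques will also be used to construct efficient, convergent algorithms. Carrying out this project will provide semiparametric statistical theory and methods for analyzing complex multivariate data.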
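Keywords (translated from Chinese): Semiparametric models; Multivariate statistical analysis; Single-index models; Latent variables; Structural equation models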
English abstract: Apart from discrete and continuous data, various complicated types of multivariate response data arise in practice, for example, the ordinal categorical data and ranking data that are commonly used in psychometrics, education, biology, medicine, and economics. Based on the single-index dimension-reduction technique, the proposed project will develop a series of semiparametric multivariate statistical models and methods for analyzing such data. In detail, we will adopt multivariate single-index-type models, multivariate single-index-type probit models, and generalized multivariate single-index-type models to analyze multivariate continuous responses, multivariate ordinal categorical data, and other data including ranking data, respectively. In addition, we will develop single-index-type structural equation models to model the covariance structure or the conditional covariance structure of these complicated multivariate data. The proposed project has at least two innovations: 1) we not only extend single-index-type models from a univariate response to multivariate responses, but also introduce the single-index technique into the research area of structural equation models; 2) in addition to the case in which all covariates in the indices are observable, we also consider the case in which the unobservable (latent) variables corresponding to ordinal categorical and ranking variables act as covariates, to meet practical needs. For the latter case, we first investigate the identifiability conditions of the models. We will use a Bayesian approach based on free-knot splines (i.e., the nonparametric functions in the model are approximated by splines, with the numbers and locations of the knots treated as random variables) to make inference about the proposed models by sampling from the joint posterior.
The advantage of the Bayesian method is that a multivariate response model can be converted into a series of univariate response models by deriving the full conditional posteriors, which greatly facilitates the subsequent analysis. To obtain efficient and rapidly converging algorithms, we will apply acceleration techniques such as the generalized Gibbs sampler, alternating subspace-spanning resampling, and the partially collapsed Gibbs sampler to our methods of analysis. The results of the proposed project will lay a solid foundation for methodological research on the analysis of complicated multivariate response data.
English keywords: Semiparametric models; Multivariate statistical analysis; Single-index models; Latent variable; Structural equation models