In the era of data explosion, statisticians have been developing interpretable and computationally efficient statistical methods to measure latent factors (e.g., skills, abilities, and personalities) using large-scale assessment data. Beyond the latent factors themselves, the effect of covariates on responses, controlling for the latent factors, is also of great scientific interest and has wide applications. One example is evaluating the fairness of educational testing, where the covariate effect reflects whether a test question is biased with respect to certain individual characteristics (e.g., gender and race) after taking latent abilities into account. However, the large sample size, substantial covariate dimension, and long test length pose challenges to developing efficient methods and drawing valid inferences. Moreover, to accommodate the commonly encountered discrete response types, nonlinear latent factor models are often assumed, which adds further complexity to the problem. To address these challenges, we consider a covariate-adjusted generalized factor model and develop novel and interpretable conditions that resolve the identifiability issue. Based on these identifiability conditions, we propose a joint maximum likelihood estimation method and establish estimation consistency and asymptotic normality results for the covariate effects under a practical yet challenging asymptotic regime. Furthermore, we derive estimation and inference results for the latent factors and the factor loadings. We illustrate the finite-sample performance of the proposed method through extensive numerical studies and an application to an educational assessment dataset from the Programme for International Student Assessment (PISA).