This paper considers parameter estimation in count data models using penalized likelihood methods. The motivating data consist of multiple independent count variables with a moderate sample size per variable. The data were collected during the assessment of oral reading fluency (ORF) in school-aged children. Specifically, each student in a sample of fourth-grade students was given one of ten possible passages to read, with the passages differing in length and difficulty. The observed number of words read incorrectly (WRI) is used to measure ORF. The goal of this paper is to efficiently estimate passage difficulty as measured by the expected proportion of words read incorrectly. Three models are considered for WRI scores, namely the binomial, the zero-inflated binomial, and the beta-binomial. Two types of penalty functions are considered for the penalized likelihood, with the goal of shrinking parameter estimates either toward zero or toward one another, respectively. A simulation study evaluates the efficacy of the shrinkage estimates using mean squared error (MSE) as a metric, with large reductions in MSE relative to maximum likelihood in some instances. The paper concludes by presenting an analysis of the motivating ORF data.
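To make the idea of shrinking passage-level estimates toward one another concrete, the following is a minimal sketch under stated assumptions, not the paper's estimator. It assumes a plain binomial model per passage, places a quadratic penalty on the logit scale that pulls each passage's parameter toward the average across passages, and maximizes the penalized log-likelihood with a simple bounded-curvature ascent step. The penalty form, the function name `fit_fused_binomial`, the tuning value `lam`, and the counts are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(t):
    """Inverse logit: maps a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def fit_fused_binomial(errors, words, lam, n_iter=2000):
    """Penalized binomial fit (illustrative assumption, not the paper's method).

    Maximizes, over theta_j = logit(p_j),
        sum_j [ y_j * theta_j - n_j * log(1 + exp(theta_j)) ]
        - lam * sum_j (theta_j - mean(theta))^2,
    where y_j is the total number of words read incorrectly on passage j
    and n_j is the total number of words attempted on passage j.
    Returns the estimated error proportions p_j.
    """
    num_passages = len(errors)
    theta = [0.0] * num_passages
    for _ in range(n_iter):
        theta_bar = sum(theta) / num_passages
        updated = []
        for y, n, t in zip(errors, words, theta):
            # Gradient of the penalized log-likelihood with respect to theta_j.
            grad = y - n * sigmoid(t) - 2.0 * lam * (t - theta_bar)
            # Upper bound on the curvature (p * (1 - p) <= 1/4), which keeps
            # every step an ascent step on the concave objective.
            curv = n / 4.0 + 2.0 * lam
            updated.append(t + grad / curv)
        theta = updated
    return [sigmoid(t) for t in theta]


# Made-up counts for four passages (not the ORF data from the paper).
errors = [12, 30, 7, 45]
words = [600, 800, 500, 900]

unpenalized = fit_fused_binomial(errors, words, lam=0.0)
shrunk = fit_fused_binomial(errors, words, lam=50.0)
print("lam = 0 :", [round(p, 4) for p in unpenalized])
print("lam = 50:", [round(p, 4) for p in shrunk])
```

With `lam=0.0` the fit reduces to the usual maximum likelihood proportions `y_j / n_j`; increasing `lam` pulls the passage-level estimates closer together. In practice the penalty weight would be selected by some data-driven criterion, which this sketch does not attempt.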