This paper investigates the efficacy of parameter shrinkage in count data models through penalized likelihood methods. The goal is to fit models to count data in which multiple independent count variables are observed with only a moderate sample size per variable. Zero-inflated counts are also plausible for such data. In the context considered here, elementary school-aged children were given passages of different lengths to read. We aim to find a suitable model that accurately captures their oral reading fluency (ORF), as measured by the number of words read incorrectly (WRI). The dataset contains the length of each passage (number of words) and the WRI scores obtained from recorded reading sessions. The idea is to find passage-level parameter estimates with good mean squared error (MSE) properties. Improvement over the MSE of maximum likelihood is sought by appending penalty functions to the negative log-likelihood. Three statistical models are considered for WRI scores: the binomial, zero-inflated binomial, and beta-binomial. The paper explores two types of penalty functions, which yield estimators that are shrunk either toward $0$ or toward the corresponding parameters of other passages. The efficacy of the shrinkage methods is explored in an extensive simulation study.
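As a rough illustration of appending a penalty to the negative log-likelihood, the sketch below fits the plain binomial model for a single passage and shrinks the per-word error probability toward a target value. Everything specific here is an assumption made for illustration and is not taken from the paper: the quadratic penalty `lam * (p - target) ** 2`, the function names, the golden-section optimizer, and the toy WRI data. Shrinkage toward other passages is approximated by using the other passage's maximum likelihood estimate as a fixed target, which is simpler than optimizing a joint penalty over all passages.

```python
import math

def binom_nll(p, errors, words):
    """Binomial negative log-likelihood for one passage (constant term dropped).

    errors: WRI counts, one per reader; words: passage length in words.
    """
    total_err = sum(errors)
    total_ok = words * len(errors) - total_err
    return -(total_err * math.log(p) + total_ok * math.log(1.0 - p))

def golden_section_min(f, lo=1e-9, hi=1.0 - 1e-9, tol=1e-10):
    """Minimize a unimodal function on (lo, hi) by golden-section search."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
    return (a + b) / 2.0

def penalized_estimate(errors, words, lam, target=0.0):
    """Penalized estimate of the per-word error probability.

    The objective is convex in p (convex likelihood term plus a quadratic
    penalty), so a one-dimensional search is sufficient.
    lam = 0 recovers the maximum likelihood estimate.
    """
    def objective(p):
        return binom_nll(p, errors, words) + lam * (p - target) ** 2
    return golden_section_min(objective)

# Hypothetical toy data: (passage length, WRI scores of four readers).
passage_a = (50, [2, 0, 3, 1])
passage_b = (80, [5, 4, 6, 3])

mle_a = sum(passage_a[1]) / (passage_a[0] * len(passage_a[1]))
mle_b = sum(passage_b[1]) / (passage_b[0] * len(passage_b[1]))

# Penalty type 1: shrink toward 0.
toward_zero_a = penalized_estimate(passage_a[1], passage_a[0], lam=1000.0)
# Penalty type 2 (simplified): shrink toward the other passage's MLE.
toward_other_a = penalized_estimate(
    passage_a[1], passage_a[0], lam=1000.0, target=mle_b
)

print("MLE:", mle_a)
print("shrunk toward 0:", toward_zero_a)
print("shrunk toward other passage:", toward_other_a)
```

The tuning constant `lam` controls the strength of the shrinkage. In a simulation study it would typically be varied, with the MSE of the resulting estimates compared against that of maximum likelihood.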