In applications of linear mixed-effects models, experimenters often desire uncertainty quantification for random quantities, like predicted treatment effects for unobserved individuals or groups. For example, consider an agricultural experiment measuring a response on animals receiving different treatments and residing on different farms. A farmer deciding whether to adopt the treatment is most interested in farm-level uncertainty quantification, such as the range of plausible treatment effects predicted at a new farm. The two-stage linear mixed-effects model is often used to model this type of data. However, standard techniques for linear mixed model-based prediction do not produce calibrated uncertainty quantification. In general, the prediction intervals used in practice are not valid: they do not meet or exceed their nominal coverage level over repeated sampling. We propose new methods for constructing prediction intervals within the two-stage model framework based on an inferential model (IM). The IM method generates prediction intervals that are guaranteed valid for any sample size. Simulation experiments suggest variations of the IM method that are both valid and efficient, a major improvement over existing methods. We illustrate the use of the IM method using two agricultural data sets, including an on-farm study where the IM-based prediction intervals suggest a higher level of uncertainty in farm-specific effects compared to the standard Student-$t$-based intervals, which are not valid.