Bayesian linear mixed-effects models and Bayesian ANOVA are increasingly being used in the cognitive sciences to perform null hypothesis tests, where a null hypothesis that an effect is zero is compared with an alternative hypothesis that the effect exists and differs from zero. While software tools for Bayes factor null hypothesis tests are easily accessible, it is often unclear how to specify the data and the model correctly. In Bayesian approaches, many authors aggregate the data at the by-subject level and estimate Bayes factors on the aggregated data. Here, we use simulation-based calibration for model inference, applied to several example experimental designs, to demonstrate that, as with frequentist analysis, such null hypothesis tests on aggregated data can be problematic in Bayesian analysis. Specifically, when random slope variances differ (i.e., when the sphericity assumption is violated), Bayes factors are too conservative for contrasts where the variance is small and too liberal for contrasts where the variance is large. Running Bayesian ANOVA on aggregated data can likewise yield biased Bayes factor results if the sphericity assumption is violated. Moreover, Bayes factors for by-subject aggregated data are biased (too liberal) when random item slope variance is present but ignored in the analysis. These problems can be circumvented or reduced by running Bayesian linear mixed-effects models on non-aggregated data, such as individual trials, and by explicitly modeling the full random effects structure. Reproducible code is available from https://osf.io/mjf47/.
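The mechanism behind the last bias — by-subject aggregation averaging away random item slope variance — can be illustrated with a minimal simulation sketch. This is not the paper's analysis (which uses Bayesian linear mixed-effects models and Bayes factors); it uses paired t-tests on aggregated data only to show the shared underlying problem, and all parameter values (subject/item counts, variance magnitudes) are arbitrary assumptions. Under a true null effect, items that respond idiosyncratically to the manipulation leave a sample-specific offset in every subject's condition difference; treating that offset as fixed inflates false positives.

```python
import numpy as np

rng = np.random.default_rng(2024)

def simulate_and_test(n_sims=500, n_subj=30, n_item=16,
                      beta=0.0, item_slope_sd=0.5, resid_sd=1.0):
    """Simulate a two-condition within-subject, within-item design with a
    true null fixed effect (beta = 0) but nonzero random item slope
    variance. Return false-positive rates of paired t-tests on
    by-subject vs. by-item aggregated condition differences.
    All parameter values are illustrative assumptions."""
    # approximate two-tailed 5% critical t values, df = n - 1
    t_crit_subj, t_crit_item = 2.045, 2.131
    fp_subj = fp_item = 0
    cond = np.array([-0.5, 0.5])  # sum-coded condition
    for _ in range(n_sims):
        # each experiment draws a fresh sample of items (random slopes)
        item_slopes = rng.normal(0.0, item_slope_sd, size=n_item)
        noise = rng.normal(0.0, resid_sd, size=(n_subj, n_item, 2))
        # trial-level data: y[subject, item, condition]
        y = (beta + item_slopes[None, :, None]) * cond + noise
        # by-subject aggregation: average over items, then condition diff;
        # the items' shared slope offset is silently treated as fixed
        d_subj = y.mean(axis=1)[:, 1] - y.mean(axis=1)[:, 0]
        t_subj = d_subj.mean() / (d_subj.std(ddof=1) / np.sqrt(n_subj))
        # by-item aggregation: average over subjects, then condition diff;
        # item slope variance correctly enters the error term
        d_item = y.mean(axis=0)[:, 1] - y.mean(axis=0)[:, 0]
        t_item = d_item.mean() / (d_item.std(ddof=1) / np.sqrt(n_item))
        fp_subj += abs(t_subj) > t_crit_subj
        fp_item += abs(t_item) > t_crit_item
    return fp_subj / n_sims, fp_item / n_sims

fp_subj, fp_item = simulate_and_test()
print(f"by-subject aggregation false-positive rate: {fp_subj:.2f}")
print(f"by-item aggregation false-positive rate:    {fp_item:.2f}")
```

With these settings, the by-subject analysis rejects the true null far more often than the nominal 5%, while the by-item analysis stays close to it; a model of the non-aggregated trial-level data with random slopes for both subjects and items avoids the bias without discarding either grouping factor.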