Workflow Techniques for the Robust Use of Bayes Factors

Inferences about hypotheses are ubiquitous in the cognitive sciences. Bayes factors provide one general way to compare hypotheses by their compatibility with the observed data, and these quantifications can then be used to choose between hypotheses. Although Bayes factors offer an immediate approach to hypothesis testing, they are highly sensitive to details of the data and model assumptions. Moreover, it is not clear how straightforwardly this approach can be implemented in practice, and in particular how sensitive it is to the details of the computational implementation. Here, we investigate these questions for Bayes factor analyses in the cognitive sciences. We explain the statistics underlying Bayes factors as a tool for Bayesian inference and discuss why utility functions are needed for principled decisions on hypotheses. Next, we study how Bayes factors misbehave under different conditions, including errors in the estimation of Bayes factors. Importantly, it is unknown whether Bayes factor estimates based on bridge sampling are unbiased for complex analyses. We are the first to use simulation-based calibration as a tool to test the accuracy of Bayes factor estimates. We also study how stable Bayes factors are across different MCMC draws and how they depend on variation in the data. In addition, we examine the variability of decisions based on Bayes factors and how to optimize decisions using a utility function. We outline a Bayes factor workflow that researchers can use to study whether Bayes factors are robust for their individual analysis, and we illustrate this workflow using an example from the cognitive sciences. We hope that this study will provide a workflow to test the strengths and limitations of Bayes factors as a way to quantify evidence in support of scientific hypotheses. Reproducible code is available from https://osf.io/y354c/.
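To make the central quantities concrete, the following is a minimal sketch and not the authors' code. It assumes a toy binomial example that does not appear in the paper: a point null H0 (theta = 0.5) is compared against H1 with a Beta(a, b) prior on theta, for which the Bayes factor has a closed form and no bridge sampling is needed. All function names and the utility values are hypothetical and chosen only for illustration.

```python
import math


def log_beta(a, b):
    """Log of the Beta function, computed via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal_h1(k, n, a=1.0, b=1.0):
    """Log marginal likelihood of k successes in n trials under theta ~ Beta(a, b).

    The binomial coefficient is omitted because it cancels in the Bayes factor.
    """
    return log_beta(k + a, n - k + b) - log_beta(a, b)


def log_marginal_h0(k, n, theta0=0.5):
    """Log likelihood under the point null theta = theta0 (same constant omitted)."""
    return k * math.log(theta0) + (n - k) * math.log(1.0 - theta0)


def bayes_factor_10(k, n, a=1.0, b=1.0, theta0=0.5):
    """Bayes factor in favour of H1 over H0: ratio of marginal likelihoods."""
    return math.exp(log_marginal_h1(k, n, a, b) - log_marginal_h0(k, n, theta0))


def posterior_prob_h1(bf10, prior_h1=0.5):
    """Turn a Bayes factor into a posterior model probability via prior odds."""
    post_odds = bf10 * prior_h1 / (1.0 - prior_h1)
    return post_odds / (1.0 + post_odds)


def best_decision(p_h1, utility):
    """Pick the decision with the highest expected utility.

    utility[decision][truth] is the (assumed) payoff of a decision
    when the named hypothesis is true.
    """
    expected = {
        d: u["H1"] * p_h1 + u["H0"] * (1.0 - p_h1) for d, u in utility.items()
    }
    return max(expected, key=expected.get), expected


if __name__ == "__main__":
    # Prior sensitivity: same data, different (hypothetical) priors under H1.
    for a, b in [(1.0, 1.0), (10.0, 10.0), (0.5, 0.5)]:
        print(f"Beta({a}, {b}) prior: BF10 = {bayes_factor_10(8, 10, a, b):.3f}")

    # Hypothetical asymmetric utilities: wrongly accepting H1 is costly.
    utility = {
        "accept_H1": {"H1": 1.0, "H0": -5.0},
        "accept_H0": {"H1": -1.0, "H0": 1.0},
    }
    p = posterior_prob_h1(bayes_factor_10(8, 10))
    print(best_decision(p, utility))
```

The loop over priors illustrates the sensitivity to model assumptions mentioned above: the data stay fixed while the Bayes factor changes with the prior under H1. The last step illustrates why a utility function matters: with asymmetric payoffs, the hypothesis with the higher posterior probability is not necessarily the one that maximizes expected utility. In realistic models the marginal likelihoods have no closed form and must be estimated, which is where the estimation errors studied in the paper arise.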
