Bayes factors for composite hypotheses have difficulty in encoding vague prior knowledge, leading to conflicts between objectivity and sensitivity, including the Jeffreys-Lindley paradox. To address these issues we revisit the posterior Bayes factor, in which the posterior distribution from the data at hand is re-used in the Bayes factor for the same data. We argue that this is biased when calibrated against proper Bayes factors, but propose bias adjustments to allow interpretation on the same scale. In the important case of a regular normal model, the bias on the log scale is half the number of parameters. The resulting empirical Bayes factor is closely related to the widely applicable information criterion. We develop test-based empirical Bayes factors for several standard tests and propose an extension to multiple testing closely related to the optimal discovery procedure. When only a P-value is available, such as in non-parametric tests, we obtain a Bayes factor calibration of 10p. We propose interpreting the strength of Bayes factors on a logarithmic scale with base 3.73, reflecting the sharpest distinction between weaker and stronger belief. Empirical Bayes factors are a frequentist-Bayesian compromise expressing an evidential view of hypothesis testing.
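As a rough illustration of the two numerical claims above (a sketch based only on the statements in this abstract, with $\mathrm{PBF}$, $\mathrm{EBF}$ and $k$ as our own notation for the posterior Bayes factor, the empirical Bayes factor and the number of parameters in a regular normal model):
\[
\log \mathrm{EBF} \;=\; \log \mathrm{PBF} \;-\; \frac{k}{2},
\qquad
\text{strength} \;=\; \log_{3.73} \mathrm{EBF} \;=\; \frac{\log \mathrm{EBF}}{\log 3.73},
\]
so the bias adjustment subtracts half the number of parameters on the log scale, and each unit of strength corresponds to one step on the proposed base-3.73 scale of belief.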