Bayesian two-interval test

The null hypothesis test (NHT) is widely used for validating scientific hypotheses but is nonetheless heavily criticized. Although Bayesian tests overcome several of these criticisms, some limitations remain. We propose a Bayesian two-interval test (2IT) in which two hypotheses, on an effect being present or absent, are expressed as prespecified joint or disjoint intervals and their posterior probabilities are computed. The same formalism can be applied to superiority, non-inferiority, or equivalence tests. The 2IT was studied on three real examples and three sets of simulations (comparison of a proportion to a reference, comparison of a mean to a reference, and comparison of two proportions). Several scenarios were created (with different sample sizes), and simulations were conducted to compute the probabilities of the parameter of interest being in the interval corresponding to either hypothesis, given data generated under one of the hypotheses. Posterior estimates were obtained using conjugacy with a low-informative prior. Bias was also estimated. The probability of accepting a hypothesis when that hypothesis is true increases progressively with the sample size, tending towards 1, while the probability of accepting the other hypothesis is always very low (less than 5%) and tends towards 0. The speed of convergence varies with the gap between the hypotheses and with their width. In the case of a mean, the bias is low and rapidly becomes negligible. We propose a Bayesian test that follows a scientifically sound process, in which two interval hypotheses are explicitly stated and tested. The proposed test has almost none of the limitations of the NHT and suggests new features, such as a rationale for serendipity or a justification for a "trend in data". The conceptual framework of the 2IT also allows the calculation of a sample size and the use of sequential methods in numerous contexts.
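To make the core computation concrete, the following is a minimal sketch of the posterior probabilities of two interval hypotheses for a single proportion. It is an illustration under stated assumptions, not the authors' implementation: the prior is taken to be a uniform Beta(1, 1) as one possible low-informative choice, the intervals H0 and H1 and the data (70 successes out of 100) are hypothetical, and the function names `beta_cdf_int` and `interval_posteriors` are invented for this example. With a Beta prior and binomial data, conjugacy gives a Beta posterior, and for integer parameters its CDF can be written as a binomial tail sum, so only the standard library is needed.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x, for positive integer a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a),
    which holds when a and b are integers.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1.0 - x) ** (n - j) for j in range(a, n + 1))


def interval_posteriors(successes, trials, h0, h1):
    """Posterior probability that the proportion lies in each interval.

    Assumption: uniform Beta(1, 1) prior, so by conjugacy the posterior is
    Beta(1 + successes, 1 + failures).
    """
    a = 1 + successes
    b = 1 + (trials - successes)

    def prob(interval):
        lower, upper = interval
        return beta_cdf_int(upper, a, b) - beta_cdf_int(lower, a, b)

    return prob(h0), prob(h1)


# Hypothetical prespecified intervals: "effect absent" and "effect present".
H0 = (0.45, 0.55)
H1 = (0.60, 1.00)

# Hypothetical data: 70 successes in 100 trials.
p0, p1 = interval_posteriors(successes=70, trials=100, h0=H0, h1=H1)
print(f"P(H0 | data) = {p0:.4f}")
print(f"P(H1 | data) = {p1:.4f}")
```

Because the two intervals here are disjoint and do not cover the whole parameter space, the two probabilities need not sum to 1; the remaining mass lies in the gap between the intervals and below H0. The binomial-sum form is convenient for moderate sample sizes; for very large samples a library routine for the regularized incomplete beta function would be a more robust choice.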
