There are two distinct definitions of 'P-value' for evaluating a proposed hypothesis or model for the process generating an observed dataset. The original definition starts with a measure of the divergence of the dataset from what was expected under the model, such as a sum of squares or a deviance statistic. A P-value is then the ordinal location of the measure in a reference distribution computed from the model and the data, and is treated as a unit-scaled index of compatibility between the data and the model. In the other definition, a P-value is a random variable on the unit interval whose realizations can be compared to a cutoff alpha to generate a decision rule with known error rates under the model and specific alternatives. It is commonly assumed that realizations of such decision P-values always correspond to divergence P-values. But this need not be so: Decision P-values can violate intuitive single-sample coherence criteria where divergence P-values do not. It is thus argued that divergence and decision P-values should be carefully distinguished in teaching, and that divergence P-values are the relevant choice when the analysis goal is to summarize evidence rather than implement a decision rule.
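As a minimal sketch of the divergence definition above, the following Python computes the ordinal location of an observed sum of squares in a simulated reference distribution, then separately compares that number to a cutoff alpha. Everything specific here is an assumption made for illustration and is not taken from the source: the normal model with known mean and standard deviation, the Monte Carlo approximation of the reference distribution, the add-one correction, and the function names `sum_of_squares`, `divergence_p_value`, and `decision`.

```python
import random


def sum_of_squares(data, mean):
    """Divergence measure: squared distance of the data from the model mean."""
    return sum((y - mean) ** 2 for y in data)


def divergence_p_value(data, mean=0.0, sd=1.0, n_sim=20000, seed=0):
    """Approximate the ordinal location of the observed divergence.

    The reference distribution is simulated from the assumed model
    (independent normal draws with the given mean and sd). The result is
    the fraction of simulated datasets at least as divergent as the
    observed one, with an add-one correction so it is never exactly zero.
    """
    rng = random.Random(seed)
    observed = sum_of_squares(data, mean)
    n = len(data)
    at_least_as_divergent = 0
    for _ in range(n_sim):
        simulated = [rng.gauss(mean, sd) for _ in range(n)]
        if sum_of_squares(simulated, mean) >= observed:
            at_least_as_divergent += 1
    return (at_least_as_divergent + 1) / (n_sim + 1)


def decision(p, alpha=0.05):
    """Decision use of the same number: compare it to a cutoff alpha."""
    return "reject" if p <= alpha else "do not reject"


# Hypothetical datasets: one close to the model mean of 0, one far from it.
close_data = [0.1, -0.4, 0.3, 0.2, -0.1]
far_data = [2.5, -2.0, 3.1, 2.2, -2.7]

p_close = divergence_p_value(close_data)
p_far = divergence_p_value(far_data)

print(p_close, decision(p_close))
print(p_far, decision(p_far))
```

In this simple setting the two uses coincide: the same number serves as a compatibility index and as the input to the cutoff rule. The sketch does not reproduce the cases the abstract refers to, in which a decision P-value fails to correspond to a divergence P-value.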