Scientists often adjust their significance threshold (alpha level) during null hypothesis significance testing in order to take into account multiple testing and multiple comparisons. This alpha adjustment has become particularly relevant in the context of the replication crisis in science. The present article considers the conditions in which this alpha adjustment is appropriate and the conditions in which it is inappropriate. A distinction is drawn between three types of multiple testing: disjunction testing, conjunction testing, and individual testing. It is argued that alpha adjustment is only appropriate in the case of disjunction testing, in which at least one test result must be significant in order to reject the associated joint null hypothesis. Alpha adjustment is inappropriate in the case of conjunction testing, in which all relevant results must be significant in order to reject the joint null hypothesis. Alpha adjustment is also inappropriate in the case of individual testing, in which each individual result must be significant in order to reject each associated individual null hypothesis. The conditions under which each of these three types of multiple testing is warranted are examined. It is concluded that researchers should not automatically (mindlessly) assume that alpha adjustment is necessary during multiple testing. Illustrations are provided in relation to joint studywise hypotheses and joint multiway ANOVAwise hypotheses.
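The three types of testing can be compared with a small calculation. The following is a minimal sketch, not taken from the article, of the Type I error rate attached to each kind of decision rule. It assumes k independent tests that all use the same alpha level and that every individual null hypothesis is true. The function names are illustrative, and the Bonferroni correction is used only as one familiar example of an alpha adjustment.

```python
def disjunction_error_rate(alpha: float, k: int) -> float:
    """Probability that at least one of k tests is significant.

    Assumes k independent tests and that all k null hypotheses are true.
    This is the error rate for rejecting a joint null hypothesis when
    a single significant result is enough to reject it.
    """
    return 1.0 - (1.0 - alpha) ** k


def conjunction_error_rate(alpha: float, k: int) -> float:
    """Probability that all k tests are significant.

    Same assumptions as above. This is the error rate for rejecting a
    joint null hypothesis when every result must be significant.
    """
    return alpha ** k


def individual_error_rate(alpha: float) -> float:
    """Error rate for one individual null hypothesis judged on its own test."""
    return alpha


def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-test alpha under a Bonferroni adjustment (one possible adjustment)."""
    return alpha / k


if __name__ == "__main__":
    alpha, k = 0.05, 3  # illustrative values, not from the article
    adjusted = bonferroni_alpha(alpha, k)
    print("disjunction, unadjusted:", disjunction_error_rate(alpha, k))
    print("disjunction, Bonferroni:", disjunction_error_rate(adjusted, k))
    print("conjunction, unadjusted:", conjunction_error_rate(alpha, k))
    print("individual, unadjusted: ", individual_error_rate(alpha))
```

Under these assumptions only the disjunction rule inflates the error rate above the nominal alpha, which is consistent with the argument that adjustment is appropriate there and not for conjunction or individual testing. When some of the individual null hypotheses are false, the conjunction error rate can be larger than alpha to the power k, but it does not exceed alpha.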