The recommended frequentist tool for establishing the absence of an effect is the equivalence test. An appropriate equivalence margin is typically expected to be specified before any data are observed. Unfortunately, choosing this margin can be difficult. If the margin is too small, the test's power will be substantially reduced. If the margin is too large, any claim of equivalence will be meaningless. Moreover, it remains unclear how defining the margin after the data are observed will bias one's results. In this short article, we consider a series of hypothetical scenarios in which the margin is defined post-hoc or is otherwise considered controversial. We also review a number of relevant, potentially problematic actual studies from clinical trials research, with the aim of motivating a critical discussion as to what is acceptable and desirable in the reporting and interpretation of equivalence tests.
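To make the role of the margin concrete, the following is a minimal sketch of an equivalence test carried out as two one-sided tests (TOST). It is an illustration under stated assumptions, not the procedure used in any of the studies discussed: it assumes a symmetric margin, a normally distributed effect estimate with a known standard error (a z-approximation), and the function name `tost_z` and the example numbers are hypothetical.

```python
from statistics import NormalDist


def tost_z(estimate, std_error, margin, alpha=0.05):
    """Two one-sided z-tests of H0: |effect| >= margin vs H1: |effect| < margin.

    Assumptions (illustrative only): symmetric margin, normal estimate,
    known standard error. Returns (p_value, equivalence_declared).
    """
    if margin <= 0 or std_error <= 0:
        raise ValueError("margin and std_error must be positive")

    norm = NormalDist()  # standard normal distribution

    # Test 1: H0 effect <= -margin, rejected when the estimate lies well above -margin.
    p_lower = 1.0 - norm.cdf((estimate + margin) / std_error)

    # Test 2: H0 effect >= +margin, rejected when the estimate lies well below +margin.
    p_upper = norm.cdf((estimate - margin) / std_error)

    # Equivalence is declared only if both one-sided tests reject,
    # so the overall p-value is the larger of the two.
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Same hypothetical data, three candidate margins.
for margin in (0.3, 0.5, 1.0):
    p, equivalent = tost_z(estimate=0.1, std_error=0.2, margin=margin)
    print(f"margin={margin}: p={p:.4f}, equivalence declared={equivalent}")
```

With identical data, the narrowest margin fails to establish equivalence while the wider margins succeed, which is why a margin chosen after seeing the data can determine the conclusion.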