P-hacking occurs when researchers engage in various behaviors that increase their chances of reporting statistically significant results. P-hacking is problematic because it reduces the informativeness of hypothesis tests -- by making significant results much more common than they are supposed to be in the absence of true significance. Despite its prevalence, p-hacking is not taken into account in hypothesis testing theory: the critical values used to determine significance assume no p-hacking. To address this problem, we build a model of p-hacking and use it to construct critical values such that, if these values are used to determine significance, and if researchers adjust their behavior to these new significance standards, then significant results occur with the desired frequency. Because such robust critical values allow for p-hacking, they are larger than classical critical values. As an illustration, we calibrate the model with evidence from the social and medical sciences. We find that the robust critical value for any test is the classical critical value for the same test with one fifth of the significance level -- a form of Bonferroni correction. For instance, for a $z$-test with a significance level of $5\%$, the robust critical value is $2.31$ instead of $1.65$ if the test is one-sided and $2.57$ instead of $1.96$ if the test is two-sided.
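The one-fifth rule stated above can be checked numerically for a $z$-test. The following is a minimal sketch, not the authors' calibration procedure: it assumes a standard normal test statistic, and the function names and the `divisor` parameter are hypothetical, introduced only for illustration.

```python
from statistics import NormalDist


def classical_critical_value(alpha: float, two_sided: bool = False) -> float:
    """Classical z-test critical value at significance level alpha."""
    # A two-sided test splits the significance level across both tails.
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def robust_critical_value(alpha: float, two_sided: bool = False,
                          divisor: float = 5.0) -> float:
    """Classical critical value evaluated at alpha / divisor.

    divisor=5 is an assumption taken from the "one fifth of the
    significance level" rule reported in the abstract.
    """
    return classical_critical_value(alpha / divisor, two_sided)


if __name__ == "__main__":
    alpha = 0.05
    for two_sided in (False, True):
        label = "two-sided" if two_sided else "one-sided"
        classical = classical_critical_value(alpha, two_sided)
        robust = robust_critical_value(alpha, two_sided)
        print(f"{label}: classical {classical:.3f}, robust {robust:.3f}")
```

Under these assumptions the sketch gives roughly 2.33 (one-sided) and 2.58 (two-sided), close to the 2.31 and 2.57 quoted above. The quoted figures come from the calibrated model, so they need not coincide exactly with an exact one-fifth rule.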