Common reporting styles for statistical results, such as $p$-values and confidence intervals (CIs), have been reported to be prone to dichotomous interpretations, especially within null hypothesis testing frameworks. For example, when the $p$-value is small enough or the CIs of the mean effects of a studied drug and a placebo do not overlap, scientists tend to claim significant differences while often disregarding the magnitudes and absolute differences in the effect sizes. Techniques relying on the visual estimation of the strength of evidence have been recommended to reduce such dichotomous interpretations, but their effectiveness has also been challenged. We ran two experiments to compare several alternative representations of confidence intervals and used Bayesian multilevel models to estimate the effects of the representation styles on differences in subjective confidence in the results. We also asked respondents for their opinions and preferences regarding the representation styles. Our results suggest that adding visual information to the classic CI representation can decrease the tendency towards dichotomous interpretations (measured as the "cliff effect": the sudden drop in confidence around a $p$-value of 0.05) compared with the classic CI visualization and a textual representation of the CI with $p$-values. As a contribution to open science, our data and all analyses are publicly available at https://github.com/helske/statvis.