Idealized probability distributions, such as the normal curve and others, lie at the root of confirmatory statistical tests. But how well do people understand these idealized curves? In practical terms, does the human visual system allow us to match sample data distributions with the hypothesized population distributions from which those samples might have been drawn? And how do different visualization techniques affect this capability? This paper shares the results of a crowdsourced experiment that tested the ability of respondents to fit normal curves to four different data distribution visualizations: bar histograms, dotplot histograms, strip plots, and boxplots. We find that the crowd can estimate the center (mean) of a distribution with some success and little bias. We also find that people generally overestimate the standard deviation, a tendency we dub the "umbrella effect" because people tend to want to cover the whole distribution with the curve, as if sheltering it from the heavens above. Finally, we find that strip plots yield the best accuracy.