Following historic polling misses in the 2016 and 2020 US Presidential elections, interest in measuring polling errors has increased. In an election postmortem, the most common way to measure directional errors and non-sampling excess variability is to assess the difference between the poll result and the election result for polls conducted within a few days of election day. Analyzing such polling error data is notoriously difficult, with typical models being extremely sensitive to the time between the poll and the election. We leverage hidden Markov models traditionally used for election forecasting to flexibly capture time-varying preferences and treat the election result as a peek at the typically hidden Markovian process. Our results are much less sensitive to the choice of time window, avoid conflating shifting preferences with polling error, and are more interpretable despite a highly flexible model. We demonstrate these results with data on polls from the 2004 through 2020 US Presidential elections and 1992 through 2020 US Senate elections, concluding that previously reported estimates of bias in Presidential elections were too extreme by 10%, estimated bias in Senatorial elections was too extreme by 25%, and excess variability estimates were also too large.