In the literature on web survey methodology, significant effort has gone into understanding the role of time-invariant factors (e.g., gender, education and marital status) in (non-)response mechanisms. Time-invariant factors alone, however, cannot account for most of the variation in (non-)response, especially fluctuations in response rates over time. This observation motivates us to investigate their counterpart, time-varying factors, and the potential role they play in web survey (non-)response. Specifically, we study the effects of time, weather and societal trends (derived from Google Trends data) on the daily (non-)response patterns of the 2016 and 2017 Dutch Health Surveys. Using discrete-time survival analysis, we find, among other things, that weekends, holidays, pleasant weather, disease outbreaks and terrorism salience are associated with fewer responses. Furthermore, we show that these variables alone achieve satisfactory prediction accuracy for both daily and cumulative response rates when the trained model is applied to future, unseen data. This approach has the further benefit of requiring only non-personal contextual information and thus raising no privacy issues. We discuss the implications of the study for survey research and data collection.
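The relation between daily and cumulative response rates in a discrete-time survival setting can be illustrated with a minimal sketch. It assumes that each day t has a hazard h_t (the probability that a sample member who has not yet responded does so on day t) and that the hazard follows a logistic link on time-varying covariates. The function names, covariate names and coefficient values below are hypothetical and chosen only for illustration; they are not estimates or code from the study.

```python
import math


def hazard(intercept, coefs, covariates):
    """Discrete-time hazard under a logistic link:
    P(respond on day t | no response before day t)."""
    eta = intercept + sum(coefs[name] * value for name, value in covariates.items())
    return 1.0 / (1.0 + math.exp(-eta))


def response_curves(hazards):
    """Return (daily, cumulative) response rates implied by daily hazards."""
    daily, cumulative = [], []
    surviving = 1.0  # share of the sample that has not yet responded
    for h in hazards:
        daily.append(surviving * h)         # new responses on this day
        surviving *= 1.0 - h                # non-respondents carried forward
        cumulative.append(1.0 - surviving)  # responses up to and including this day
    return daily, cumulative


# Hypothetical coefficients (illustration only, not results from the study).
INTERCEPT = -3.0
COEFS = {"weekend": -0.5, "holiday": -0.8}

# Three consecutive days described by time-varying indicators.
days = [
    {"weekend": 0, "holiday": 0},
    {"weekend": 1, "holiday": 0},
    {"weekend": 1, "holiday": 1},
]

hazards = [hazard(INTERCEPT, COEFS, day) for day in days]
daily, cumulative = response_curves(hazards)
```

With negative coefficients, the hazard is lower on weekend and holiday days, and the cumulative response rate equals the running sum of the daily rates.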