Forecasting political, economic, and public health indicators from internet activity has produced mixed results. For example, while some measures of explicitly surveyed public opinion correlate well with social media proxies, the opportunity for profitable investment strategies driven solely by sentiment extracted from social media appears to have expired. Nevertheless, the internet's space of potentially predictive input signals is combinatorially vast and will continue to invite careful exploration. Here, we combine unemployment-related search data from Google Trends with economic language on Twitter to attempt to nowcast and forecast (1) state and national unemployment claims for the US and (2) consumer confidence in G7 countries. Building on a recently developed search-query-based model, we show that incorporating Twitter data improves forecasting of unemployment claims, while the original method remains marginally better at nowcasting. Enriching the input signal with temporal statistical features (e.g., moving average and rate of change) further reduces errors and improves the predictive utility of the proposed method when applied to other economic indices, such as consumer confidence.
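As an illustration of the temporal statistical features mentioned above, the following is a minimal sketch of how a trailing moving average and a rate of change could be derived from a weekly input signal. It is not the paper's implementation: the function names, the window and lag values, and the sample numbers are all hypothetical and chosen only for illustration.

```python
def moving_average(series, window):
    """Trailing mean over the last `window` values; None until enough history exists."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = series[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


def rate_of_change(series, lag=1):
    """Relative change versus the value `lag` steps earlier; None where undefined."""
    out = []
    for i in range(len(series)):
        if i < lag or series[i - lag] == 0:
            out.append(None)
        else:
            out.append((series[i] - series[i - lag]) / series[i - lag])
    return out


def enrich(series, window=3, lag=1):
    """Attach both derived features to each raw observation."""
    ma = moving_average(series, window)
    roc = rate_of_change(series, lag)
    return [
        {"value": v, "moving_average": m, "rate_of_change": r}
        for v, m, r in zip(series, ma, roc)
    ]


# Made-up weekly search-interest values, for illustration only.
weekly_search_interest = [40.0, 50.0, 60.0, 90.0]
features = enrich(weekly_search_interest, window=3, lag=1)
```

Both features use only past and current values, so each row could be fed to a nowcasting or forecasting model without leaking future information. Rows without enough history carry `None` and would need to be dropped or imputed before fitting.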