Forecasting tourism demand has important implications for both policy makers and companies operating in the tourism industry. In this research, we applied methods and tools of social network and semantic analysis to study user-generated content retrieved from online communities that interacted on the TripAdvisor travel forum. We analyzed the forums of seven major European capital cities over a period of 10 years, collecting more than 2,660,000 posts written by about 147,000 users. We present a new methodology for analyzing tourism-related big data and a set of variables that could be integrated into traditional forecasting models. We implemented Factor-Augmented Autoregressive and Bridge models with social network and semantic variables, which often led to better forecasting performance than univariate models and models based on Google Trends data. Forum language complexity and the centralization of the communication network, i.e., the presence of eminent contributors, were the variables that contributed most to the forecasting of international airport arrivals.
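To make the notion of network centralization concrete, the following minimal Python sketch computes Freeman's degree centralization for a small reply network. The abstract does not state which centralization measure was used, so the choice of degree centralization, the function name `degree_centralization`, and the toy edge lists are illustrative assumptions rather than the method of the study.

```python
from collections import defaultdict


def degree_centralization(edges):
    """Freeman degree centralization of an undirected, unweighted graph.

    Assumption: each edge (a, b) means users a and b exchanged at least
    one post; this is a hypothetical encoding, not the paper's.
    Returns a value in [0, 1]: 1 for a star, 0 when all degrees are equal.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-replies
            neighbors[a].add(b)
            neighbors[b].add(a)

    n = len(neighbors)
    if n < 3:
        return 0.0  # the measure is undefined for fewer than 3 nodes

    degrees = [len(adjacent) for adjacent in neighbors.values()]
    max_degree = max(degrees)
    # Sum of gaps to the most connected node, normalized by the
    # largest possible sum, which is attained by a star graph.
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))


# Toy forum where one prominent contributor answers everyone.
star = [("hub", f"user{i}") for i in range(5)]
# Toy forum where every user talks to exactly two others.
ring = [(f"user{i}", f"user{(i + 1) % 6}") for i in range(6)]

print(degree_centralization(star))
print(degree_centralization(ring))
```

Under this measure, a forum dominated by a few highly connected contributors scores close to 1, while a forum in which interactions are spread evenly scores close to 0.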