The current body of macroeconomic knowledge is built on interactions among a small number of variables, since traditional macroeconomic models can mostly handle only a handful of inputs. Recent work using big data suggests that a much larger number of variables actively drive the dynamics of the aggregate economy. In this paper, we introduce a knowledge graph (KG) that contains linkages not only among traditional economic variables but also with new alternative big data variables. We extract these new variables and linkages by applying advanced natural language processing (NLP) tools to a massive body of text from academic literature and research reports. As one example of its potential applications, we use the KG as prior knowledge to select variables for macroeconomic forecasting models. Compared to statistical variable selection methods, KG-based methods achieve significantly higher forecasting accuracy, especially for long-run forecasts.
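To make the idea of using a KG as prior knowledge for variable selection concrete, here is a minimal sketch. It is an illustration under stated assumptions and not the procedure used in the paper: the toy graph, the variable names, the `select_variables` helper, and the rule "keep every variable within a fixed number of hops of the forecast target" are all hypothetical.

```python
from collections import deque


def select_variables(graph, target, max_hops=1):
    """Return variables reachable from `target` within `max_hops` edges.

    Hypothetical selection rule: graph distance to the forecast target
    stands in for prior knowledge about relevance. Results are ordered
    by hop distance, then alphabetically.
    """
    hops = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    del hops[target]  # the target itself is not a predictor
    return sorted(hops, key=lambda v: (hops[v], v))


# Toy undirected graph as an adjacency list. All nodes and edges are
# invented for illustration; none are taken from the paper's KG.
toy_kg = {
    "inflation": ["unemployment", "oil_price", "money_supply"],
    "unemployment": ["inflation", "job_postings"],
    "oil_price": ["inflation", "shipping_traffic"],
    "money_supply": ["inflation"],
    "job_postings": ["unemployment"],
    "shipping_traffic": ["oil_price"],
    "rainfall": ["crop_yield"],
    "crop_yield": ["rainfall"],
}

one_hop = select_variables(toy_kg, "inflation", max_hops=1)
two_hop = select_variables(toy_kg, "inflation", max_hops=2)
print(one_hop)
print(two_hop)
```

In this toy setting, widening `max_hops` brings in alternative variables such as `job_postings` that are linked to the target only through a traditional variable, while variables with no path to the target, such as `rainfall`, are never selected. The selected names would then be passed as regressors to whatever forecasting model is in use.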