The relatedness between a country or a firm and a product measures how feasible that economic activity is. As such, it drives investment decisions at both the private and the institutional level. Traditionally, relatedness is measured using networks derived from country-level co-occurrences of product pairs, that is, by counting how many countries export both products. In this work, we compare networks and machine learning algorithms trained not only on country-level data but also on firm-level data, a setting that has received little attention because firm-level data are scarce. We quantitatively compare the different measures of relatedness by using them to forecast exports at the country and firm level, assuming that more related products are more likely to be exported in the future. Our results show that relatedness is scale-dependent: the best assessments are obtained by applying machine learning to the same type of data one wants to predict. Moreover, we find that while relatedness measures based on country data are not suitable for firms, firm-level data are also very informative about the development of countries. In this sense, models built on firm data provide a better assessment of relatedness. We also discuss the effect of using parameter optimization and community detection algorithms to identify clusters of related companies and products, finding that a partition into a larger number of blocks decreases the computational time while keeping prediction performance well above the network-based benchmarks.
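As an illustration of the traditional network-based approach mentioned above, the following minimal sketch counts, for each product pair, how many countries export both, and then scores each country-product pair by how strongly the product co-occurs with the country's current exports. The toy matrix `M`, the function names, and the density-style normalization are assumptions made for this example; they are not taken from the paper, whose exact network definitions may differ.

```python
import numpy as np


def cooccurrence(M):
    """Count, for each product pair, how many countries export both.

    M is a binary (countries x products) matrix, where M[c, p] = 1
    means country c exports product p. The diagonal is zeroed so a
    product is not counted as co-occurring with itself.
    """
    M = np.asarray(M, dtype=float)
    C = M.T @ M
    np.fill_diagonal(C, 0.0)
    return C


def relatedness(M, C):
    """Density-style relatedness score (an assumed normalization).

    For country c and product p, this is the share of p's total
    co-occurrence weight that comes from products c already exports.
    """
    M = np.asarray(M, dtype=float)
    totals = C.sum(axis=0)
    # Avoid division by zero for products that never co-occur.
    safe_totals = np.where(totals > 0, totals, 1.0)
    return (M @ C) / safe_totals


# Hypothetical toy data: 3 countries (rows) and 3 products (columns).
M = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
])

C = cooccurrence(M)
R = relatedness(M, C)
print(C)
print(R)
```

Under the assumption stated in the abstract, products with a higher score for a country that does not yet export them would be ranked as more likely future exports. The same construction could be applied to a firms-by-products matrix to obtain a firm-level analogue.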