The analysis of multidimensional data is an increasingly relevant topic in statistical and machine learning research. Given their complexity, such data objects are usually reshaped into matrices or vectors before being analysed. However, this approach has two main drawbacks. First, it destroys the intrinsic interconnections among data points in the multidimensional space; second, the number of parameters to be estimated grows exponentially with the number of dimensions. In this paper, we propose a parsimonious tensor regression model that overcomes these drawbacks by retaining the intrinsic multidimensional structure of the dataset. A Tucker structure is employed to achieve parsimony, and a shrinkage penalization is introduced to deal with over-fitting and collinearity. To estimate the model parameters, we develop an Alternating Least Squares algorithm. A simulation exercise is conducted to validate the model's performance and robustness. Moreover, we perform an empirical analysis that highlights the forecasting power of the model relative to benchmark models. This is achieved by implementing an autoregressive specification on the Foursquares spatio-temporal dataset together with a macroeconomic panel dataset. Overall, the proposed model outperforms benchmark models from the forecasting literature.
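To make the parsimony argument concrete, the following minimal sketch compares the number of coefficients in an unrestricted coefficient array with the number implied by a Tucker structure (one core tensor plus one factor matrix per mode). The function names, dimensions, and ranks are hypothetical and chosen only for illustration; they are not taken from the paper. The Tucker count is a simple upper bound that ignores the rotational redundancy between the core and the factor matrices.

```python
from math import prod


def full_param_count(dims):
    """Coefficients in an unrestricted array with the given mode sizes."""
    return prod(dims)


def tucker_param_count(dims, ranks):
    """Core tensor entries plus one (dim x rank) factor matrix per mode.

    This is an upper bound on the free parameters: it does not subtract
    the redundancy between the core and the factor matrices.
    """
    if len(dims) != len(ranks):
        raise ValueError("dims and ranks must have the same length")
    return prod(ranks) + sum(d * r for d, r in zip(dims, ranks))


if __name__ == "__main__":
    # Hypothetical sizes: a four-mode coefficient array, ten levels per mode,
    # with Tucker rank two along every mode (illustrative assumption).
    dims = (10, 10, 10, 10)
    ranks = (2, 2, 2, 2)
    print("unrestricted:", full_param_count(dims))
    print("Tucker:", tucker_param_count(dims, ranks))
```

With these illustrative sizes, the unrestricted count multiplies by ten for every additional mode, whereas the Tucker count adds only one more factor matrix and enlarges the small core.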