Multivariate distributional forecasts have become widespread in recent years. To assess the quality of such forecasts, suitable evaluation methods are needed. In the univariate case, calibration tests based on the probability integral transform (PIT) are routinely used. However, multivariate extensions of PIT-based calibration tests face various challenges. We therefore introduce a general framework for calibration testing in the multivariate case and propose two new tests that arise from it. Both approaches use proper scoring rules and are simple to implement even in large dimensions. The first employs the PIT of the score. The second is based on comparing the expected performance of the forecast distribution (i.e., the expected score) to its actual performance based on realized observations (i.e., the realized score). The tests have good size and power properties in simulations and solve various problems of existing tests. We apply the new tests to forecast distributions for macroeconomic and financial time series data.
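To make the two ideas concrete, the following is a minimal sketch, not the authors' implementation. It assumes a forecast distribution made of independent normals, the logarithmic score as the proper scoring rule (negatively oriented, so lower is better), and Monte Carlo draws from the forecast to approximate the distribution of the score. All function names (`log_score`, `draw`, `score_pit_and_gap`, `evaluate`) are hypothetical, and the plain t-statistic stands in for whatever test statistic the paper actually uses, which the abstract does not specify.

```python
import math
import random
import statistics


def log_score(y, mu, sigma):
    """Logarithmic score (negative log density) of y under independent
    normals N(mu_i, sigma_i^2). Assumed scoring rule, lower is better."""
    return sum(
        0.5 * math.log(2 * math.pi * s * s) + 0.5 * ((yi - m) / s) ** 2
        for yi, m, s in zip(y, mu, sigma)
    )


def draw(mu, sigma, rng):
    """Draw one vector from the independent-normal distribution."""
    return [rng.gauss(m, s) for m, s in zip(mu, sigma)]


def score_pit_and_gap(y, mu, sigma, rng, n_sim=200):
    """Return (score PIT, realized score minus expected score) for one period.

    The score PIT is the share of scores simulated under the forecast that
    do not exceed the realized score. The expected score is approximated by
    the mean of the simulated scores.
    """
    realized = log_score(y, mu, sigma)
    simulated = [log_score(draw(mu, sigma, rng), mu, sigma) for _ in range(n_sim)]
    pit = sum(s <= realized for s in simulated) / n_sim
    gap = realized - statistics.fmean(simulated)
    return pit, gap


def evaluate(true_scale, n_periods=300, dim=5, seed=0):
    """Simulate n_periods forecasts of N(0, I) while the data are N(0, true_scale^2 I).

    Returns the mean score PIT and a t-statistic for the hypothesis that the
    mean gap between realized and expected score is zero.
    """
    rng = random.Random(seed)
    mu = [0.0] * dim
    sigma = [1.0] * dim
    true_sigma = [true_scale * s for s in sigma]
    pits, gaps = [], []
    for _ in range(n_periods):
        y = draw(mu, true_sigma, rng)  # observation from the true process
        pit, gap = score_pit_and_gap(y, mu, sigma, rng)
        pits.append(pit)
        gaps.append(gap)
    t_stat = statistics.fmean(gaps) / (statistics.stdev(gaps) / math.sqrt(n_periods))
    return statistics.fmean(pits), t_stat
```

Calling `evaluate(1.0)` mimics a calibrated forecast: the score PITs should look roughly uniform and the t-statistic should be small. Calling `evaluate(1.5)` mimics an underdispersed forecast: realized scores are systematically worse than expected, so the PITs pile up near one and the t-statistic becomes large. Both quantities reduce the multivariate problem to a scalar one, which is why the approach stays simple in large dimensions.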