Distributional regression is extended to Gaussian response vectors of dimension greater than two by parameterizing the covariance matrix $\Sigma$ of the response distribution using the entries of its Cholesky decomposition. The more common variance-correlation parameterization limits such regressions to bivariate responses -- higher dimensions require complicated constraints among the correlations to ensure positive definite $\Sigma$ and a well-defined probability density function. In contrast, Cholesky-based parameterizations ensure positive definiteness for all distributional dimensions no matter what values the parameters take, enabling estimation and regularization as for other distributional regression models. In cases where components of the response vector are assumed to be conditionally independent beyond a certain lag $r$, model complexity can be further reduced by setting Cholesky parameters beyond this lag to zero a priori. Cholesky-based multivariate Gaussian regression is first illustrated and assessed on artificial data and subsequently applied to a real-world 10-dimensional weather forecasting problem. There the regression is used to obtain reliable joint probabilities of temperature across ten future times, leveraging temporal correlations over the prediction period to obtain more precise and meteorologically consistent probabilistic forecasts.
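The positive-definiteness argument can be made concrete with a minimal sketch. The code below is an illustration under stated assumptions, not the authors' implementation. The helper names (`build_factor`, `gram`, `is_positive_definite`), the log link on the diagonal, and the choice to form the product $LL^\top$ directly are all hypothetical. The abstract does not say whether $\Sigma$ or its inverse is factored, nor which link functions are used. The sketch shows that any real-valued parameters, including those with entries beyond lag $r$ fixed at zero, yield a positive definite matrix, whereas three individually valid correlations can fail to do so.

```python
import math
import random


def build_factor(diag_params, offdiag_params, lag=None):
    """Build a lower-triangular factor L from unconstrained real parameters.

    diag_params    : list of k reals; mapped through exp so the diagonal is > 0
                     (the log link is an assumption of this sketch).
    offdiag_params : dict {(i, j): value} with i > j; missing entries are 0.
    lag            : if given, entries with i - j > lag are fixed at zero.
    """
    k = len(diag_params)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        L[i][i] = math.exp(diag_params[i])
        for j in range(i):
            if lag is None or i - j <= lag:
                L[i][j] = offdiag_params.get((i, j), 0.0)
    return L


def gram(L):
    """Return the symmetric matrix L L^T."""
    k = len(L)
    return [[sum(L[i][m] * L[j][m] for m in range(k)) for j in range(k)]
            for i in range(k)]


def is_positive_definite(A):
    """Attempt a Cholesky factorization of symmetric A; True if it succeeds."""
    k = len(A)
    C = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = A[i][j] - sum(C[i][m] * C[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # non-positive pivot: not positive definite
                C[i][i] = math.sqrt(s)
            else:
                C[i][j] = s / C[j][j]
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    k, r = 10, 2  # dimension and lag chosen only for illustration

    # Arbitrary unconstrained parameters always give a positive definite product.
    for _ in range(100):
        diag = [rng.uniform(-0.5, 0.5) for _ in range(k)]
        off = {(i, j): rng.uniform(-1.0, 1.0)
               for i in range(k) for j in range(i)}
        assert is_positive_definite(gram(build_factor(diag, off, lag=r)))

    # Each pairwise correlation below is valid on its own, yet the
    # 3 x 3 correlation matrix they form is not positive definite.
    R = [[1.0, 0.9, 0.9],
         [0.9, 1.0, -0.9],
         [0.9, -0.9, 1.0]]
    print(is_positive_definite(R))
```

In a regression setting, each entry of `diag_params` and `offdiag_params` would be the output of its own predictor, so no joint constraint has to be enforced during estimation. The banded case with `lag=r` simply drops the predictors for entries more than $r$ positions below the diagonal.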