The paper describes the use of Bayesian regression for building time series models and for stacking different predictive models for time series. Bayesian regression for time series modeling with a nonlinear trend is analyzed. This approach makes it possible to estimate the uncertainty of time series predictions and to calculate value-at-risk characteristics. A hierarchical model for time series using Bayesian regression is also considered. In this approach, one set of parameters is shared by all data samples, while other parameters can differ between groups of data samples. This allows the model to be used when only short historical data are available for a specific time series, e.g. for new stores or new products in the sales prediction problem. In the study of predictive model stacking, ARIMA, Neural Network, Random Forest, and Extra Tree models were used for prediction on the first level of the model ensemble. On the second level, the time series predictions of these models on the validation set were stacked by Bayesian regression. This approach gives distributions for the regression coefficients of these models, which makes it possible to estimate the uncertainty contributed by each model to the stacking result. Information about these distributions allows us to select an optimal set of stacking models, taking domain knowledge into account. The probabilistic approach to stacking predictive models allows us to assess the risk of predictions that are important in a decision-making process.
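The second-level stacking step can be illustrated with a minimal sketch. It is not the implementation used in the paper: it assumes a zero-mean Gaussian prior on the stacking coefficients and a known noise variance, so that the posterior is available in closed form, whereas a full treatment would typically infer the noise level as well. The first-level predictions are synthetic stand-ins for the ARIMA, Neural Network, and Random Forest outputs, and the names `bayesian_stacking`, `predictive`, `prior_var`, and `noise_var` are hypothetical.

```python
import numpy as np


def bayesian_stacking(preds, y, prior_var=10.0, noise_var=1.0):
    """Closed-form Gaussian posterior over stacking coefficients.

    preds: (n_samples, n_models) first-level predictions on the validation set.
    y:     (n_samples,) observed target values on the validation set.
    Assumes a N(0, prior_var * I) prior and known noise variance (illustrative).
    """
    X = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    n_models = X.shape[1]
    precision = np.eye(n_models) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)          # posterior covariance
    mean = cov @ (X.T @ y) / noise_var      # posterior mean
    return mean, cov


def predictive(preds_new, mean, cov, noise_var=1.0):
    """Mean and standard deviation of the stacked predictive distribution."""
    X = np.atleast_2d(np.asarray(preds_new, dtype=float))
    mu = X @ mean
    # Coefficient uncertainty (x^T cov x per row) plus observation noise.
    var = np.einsum("ij,jk,ik->i", X, cov, X) + noise_var
    return mu, np.sqrt(var)


# Synthetic data: three first-level "models" with increasing error levels.
rng = np.random.default_rng(0)
n = 200
truth = 50.0 + 10.0 * np.sin(np.linspace(0.0, 6.0, n))
preds = np.column_stack([
    truth + rng.normal(0.0, 1.0, n),   # stand-in for an accurate model
    truth + rng.normal(0.0, 3.0, n),   # stand-in for a moderate model
    truth + rng.normal(0.0, 6.0, n),   # stand-in for a noisy model
])
y = truth + rng.normal(0.0, 1.0, n)

mean, cov = bayesian_stacking(preds, y)
coef_sd = np.sqrt(np.diag(cov))        # uncertainty of each model's coefficient

mu, sd = predictive(preds[:5], mean, cov)
lower_5pct = mu - 1.645 * sd           # 5% quantile, a value-at-risk style bound

print("coefficient means:", mean)
print("coefficient std devs:", coef_sd)
print("5% lower bounds:", lower_5pct)
```

Under these assumptions, the posterior standard deviations in `coef_sd` play the role of the coefficient distributions described above: a model whose coefficient is small relative to its spread is a candidate for removal from the stacking set, subject to domain knowledge.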