Photovoltaic (PV) power forecasting models are predominantly based on machine learning algorithms that provide no insight into, or explanation of, their predictions (black boxes). Therefore, their direct implementation in environments where transparency is required, and the trust associated with their predictions, may be questioned. To this end, we propose a two-stage probabilistic forecasting framework able to generate highly accurate, reliable, and sharp forecasts while offering full transparency on both the point forecasts and the prediction intervals (PIs). In the first stage, we exploit natural gradient boosting (NGBoost) to yield probabilistic forecasts, while in the second stage, we calculate the Shapley additive explanations (SHAP) values in order to fully comprehend why a prediction was made. To highlight the performance and the applicability of the proposed framework, real data from two PV parks located in Southern Germany are employed. Comparative results with two state-of-the-art algorithms, namely Gaussian process and lower upper bound estimation, show a significant increase in the point forecast accuracy and in the overall probabilistic performance. Most importantly, we present a detailed analysis of the model's complex nonlinear relationships and of the interaction effects between the various features. This analysis allows interpreting the model, identifying some learned physical properties, explaining individual predictions, reducing the computational requirements of training without jeopardizing the model accuracy, detecting possible bugs, and gaining trust in the model. Finally, we conclude that the model was able to develop complex nonlinear relationships that follow known physical properties as well as human logic and intuition.
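To make the second stage more concrete, the following is a minimal sketch of how Shapley values attribute a single prediction to its input features. It is not the framework described above: `toy_pv_power` is a hypothetical stand-in for a trained forecaster, the feature names and numbers are invented for illustration, and the attribution is computed by exhaustively enumerating feature coalitions against a single baseline point, which is a simplification of what SHAP implementations do for tree models.

```python
from itertools import combinations
from math import factorial


def toy_pv_power(features):
    """Hypothetical stand-in for a trained forecaster (NOT the paper's model)."""
    irradiance = features["irradiance"]
    temperature = features["temperature"]
    # Output rises with irradiance and drops slightly as temperature exceeds 25.
    return 0.8 * irradiance * (1.0 - 0.004 * (temperature - 25.0))


def shapley_values(model, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Evaluate the model with coalition features taken from the instance
        # and all remaining features taken from the baseline.
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {name}) - value(members))
        phi[name] = total
    return phi


# Illustrative values only (assumed, not taken from the paper's data).
instance = {"irradiance": 800.0, "temperature": 35.0}
baseline = {"irradiance": 400.0, "temperature": 25.0}
phi = shapley_values(toy_pv_power, instance, baseline)
```

The attributions sum to the difference between the prediction for `instance` and the prediction for `baseline`, which is the additivity property that lets each forecast be decomposed feature by feature. Exhaustive enumeration scales exponentially with the number of features, so it is only practical for toy cases like this one.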