Wind turbine power curve models translate ambient conditions into turbine power output. They are essential for energy yield prediction and turbine performance monitoring. In recent years, data-driven machine learning (ML) methods have outperformed parametric, physics-informed approaches. However, they are often criticised as opaque "black boxes", which raises concerns about their robustness in non-stationary environments such as those faced by wind turbines. We therefore introduce an explainable artificial intelligence (XAI) framework to investigate and validate the strategies that data-driven power curve models learn from operational SCADA data. It combines domain-specific considerations with Shapley values and the latest findings from XAI for regression. Our results suggest that learned strategies can be better indicators of model robustness than validation or test set errors. Moreover, we observe that highly complex, state-of-the-art ML models are prone to learning physically implausible strategies. Consequently, we compare several measures to ensure physically reasonable model behaviour. Lastly, we propose using XAI in wind turbine performance monitoring to disentangle the environmental and technical effects that cause deviations from the expected turbine output. We hope our work can guide domain experts towards training and selecting more transparent and robust data-driven wind turbine power curve models.
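To make the idea of attributing a power prediction to its inputs more concrete, the following is a minimal sketch of exact Shapley values computed against a reference point. It is not the framework from the paper: the toy model `toy_power_kw`, its constants (rotor area, power coefficient, rated power, turbulence penalty), the feature names, and the chosen operating points are all assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial


def toy_power_kw(wind_speed, air_density, turbulence_intensity):
    """Hypothetical power curve (illustration only, not a real turbine)."""
    rotor_area = 5000.0   # m^2, assumed
    cp = 0.4              # power coefficient, assumed
    rated_kw = 2000.0     # rated power, assumed
    raw_kw = 0.5 * air_density * rotor_area * cp * wind_speed ** 3 / 1000.0
    # Assumed linear penalty for turbulence, chosen only to add a third input.
    raw_kw *= 1.0 - 0.5 * turbulence_intensity
    return min(raw_kw, rated_kw)


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        inputs = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(**inputs)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


# Assumed operating point and reference point.
x = {"wind_speed": 8.0, "air_density": 1.225, "turbulence_intensity": 0.1}
baseline = {"wind_speed": 6.0, "air_density": 1.2, "turbulence_intensity": 0.1}

phi = shapley_values(toy_power_kw, x, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.2f} kW")
```

The attributions sum to the difference between the prediction at `x` and the prediction at `baseline`, so each value can be read as a share (in kW) of that deviation. A feature that equals its baseline value, here the turbulence intensity, receives an attribution of zero.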