Wind turbine manufacturers collect terabytes of data from their fleets every day. The data contain valuable real-time information for turbine health diagnostics and performance monitoring, and for predicting rare failures and the remaining service life of critical parts. Yet this wealth of data remains inaccessible to operators, utility companies, and researchers, because manufacturers prefer to keep their fleets' turbine data private for strategic business reasons. The lack of data access impedes the exploitation of opportunities such as improving data-driven turbine operation and maintenance strategies and reducing downtimes. We present a distributed federated machine learning approach that leaves the data on the wind turbines to preserve data privacy, as desired by manufacturers, while still enabling fleet-wide learning on those local data. We demonstrate in a case study that wind turbines with scarce representative training data obtain more accurate fault detection models through federated learning, while no turbine experiences a loss in model performance by participating in the federated learning process. Compared with conventional training, the average model training time increases by a factor of 7 in federated training due to additional communication and overhead operations. Model training times might therefore constitute an impediment that needs to be further explored and alleviated in federated learning applications, especially for large wind turbine fleets.
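The abstract does not state which aggregation scheme is used, so the following is only a minimal sketch of the general idea, assuming plain federated averaging (FedAvg) over a toy linear model. The function names (`local_update`, `federated_average`, `train_federated`, `make_turbine_data`), the synthetic data, and all hyperparameters are hypothetical and chosen for illustration. Each simulated turbine trains on its own data and shares only model parameters, which a server averages weighted by local sample count.

```python
import random

def local_update(weights, data, lr=0.2, epochs=5):
    """Run a few gradient-descent epochs on one turbine's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error of the model y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(updates, sizes):
    """Average client parameters, weighted by each client's sample count."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)

def train_federated(clients, rounds=200):
    """Alternate local training and server-side averaging.

    Only model parameters are exchanged; the raw (x, y) samples
    never leave the client that owns them.
    """
    global_weights = (0.0, 0.0)
    sizes = [len(data) for data in clients]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights

def make_turbine_data(n, rng):
    """Synthetic stand-in for turbine data: y = 2x + 1 plus small noise."""
    samples = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        samples.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)))
    return samples

# Assumption: three turbines, one of which has very little training data.
rng = random.Random(0)
clients = [make_turbine_data(n, rng) for n in (200, 150, 5)]
w, b = train_federated(clients)
print(f"global model: w={w:.3f}, b={b:.3f}")
```

In this toy setting the data-scarce third client receives a global model fitted mostly on the other clients' data, which mirrors the benefit the abstract reports for turbines with little representative training data. The repeated rounds of local training and parameter exchange also hint at where the additional training time comes from.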