Mining the spatial and temporal correlations in wind farm output data can improve the accuracy of ultra-short-term wind power prediction. However, wind farms owned by separate entities may be reluctant to share their data directly because of privacy concerns and business management regulations. Although cryptographic approaches have been designed to protect privacy during data sharing, encrypting the original data while still extracting the nonlinear relationships among multiple wind farms during machine learning remains a challenging problem. This paper presents pwXGBoost, a technique based on a machine learning tree model and secure multi-party computation (SMPC) that can extract complicated relationships while preserving data privacy. A maximum mean discrepancy (MMD) based scheme is proposed to select adjacent candidate wind farms to participate in collaborative model training, thereby improving accuracy and reducing the burden of data acquisition. The proposed method was evaluated on real-world data collected from a cluster of wind farms in Inner Mongolia, China, demonstrating that it can achieve considerable efficiency and performance improvements while preserving privacy.
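To make the MMD-based selection idea concrete, the following is a minimal sketch of how candidate wind farms might be ranked by distributional similarity to a target farm. It is not the paper's implementation: the RBF kernel, the `gamma` value, the biased estimator, the function names (`rbf_kernel`, `mmd2`, `rank_candidates`), the synthetic data, and the rule that a lower MMD marks a better candidate are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Pairwise RBF kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, gamma).mean()
    k_yy = rbf_kernel(y, y, gamma).mean()
    k_xy = rbf_kernel(x, y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy


def rank_candidates(target, candidates, gamma=1.0):
    """Sort candidate farms by squared MMD to the target, lowest first."""
    scores = {name: mmd2(target, data, gamma) for name, data in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1])


# Synthetic stand-ins for power output samples (assumption: not real data).
rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=(200, 1))
candidates = {
    "farm_a": rng.normal(0.1, 1.0, size=(200, 1)),  # similar distribution
    "farm_b": rng.normal(3.0, 1.0, size=(200, 1)),  # shifted distribution
}
ranking = rank_candidates(target, candidates)
```

Under these assumptions, the farms at the head of `ranking` would be the ones invited into collaborative training. In a privacy-preserving setting the statistic itself would have to be computed without exposing raw samples, which this sketch does not attempt.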