Photovoltaic (PV) energy generation plays a crucial role in the energy transition. Small-scale PV installations are being deployed at an unprecedented pace, and integrating them into the grid can be challenging because public authorities often lack quality data about them. Overhead imagery is increasingly used to improve knowledge of residential PV installations, with machine learning models capable of automatically mapping these installations. However, these models cannot be easily transferred from one region or data source to another due to differences in image acquisition. To address this issue, known as domain shift, and to foster the development of PV array mapping pipelines, we propose a dataset containing aerial images, annotations, and segmentation masks. We provide installation metadata for more than 28,000 installations and ground truth segmentation masks for 13,000 installations, including 7,000 with annotations for two different image providers. Finally, for more than 8,000 installations, we provide installation metadata that matches the annotation. Dataset applications include end-to-end PV registry construction, robust PV installation mapping, and analysis of crowdsourced datasets.