The development of a knowledge repository for climate science data is a multidisciplinary effort involving domain experts (climate scientists), data engineers whose skills include designing and building knowledge repositories, and machine learning researchers who provide expertise on data preparation tasks such as gap filling and advise on machine learning models that can exploit this data. One of the main goals of the CA20108 COST Action is to develop a knowledge portal that is fully compliant with the FAIR principles for scientific data management. In the first year, a bespoke knowledge portal was developed to capture metadata for FAIR datasets. Its purpose was to provide detailed metadata descriptions for shareable \micro data using the WMO standard. The system stores Network, Site and Sensor metadata locally, passes the actual data to Zenodo, and receives back a DOI, thereby creating a permanent link between the Knowledge Portal and the Zenodo storage platform. When a user searches the Knowledge Portal (metadata), the results provide both detailed descriptions and links to the data on the Zenodo platform.
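Translation: The main aim of this paper is to develop a FAIR-compliant knowledge repository for climate science data and to provide one-stop query and retrieval over that repository. Achieving this requires cross-disciplinary collaboration among domain experts (climate scientists), data engineers, and machine learning researchers. The domain experts provide representative climate science data, the data engineers build a knowledge repository with efficient query and retrieval, and the machine learning researchers provide technical support for data processing, such as filling in missing values and selecting suitable machine learning models. The CA20108 COST Action described here therefore aims to develop a FAIR-compliant knowledge portal for climate data and to contribute to the provision of high-quality, reliable climate science data.
In the first year, we developed a bespoke knowledge portal to capture metadata for datasets that meet the FAIR principles. By collecting detailed metadata descriptions of each component in line with the WMO standard, the portal helps store and share micro data. The system stores Network, Site and Sensor metadata locally and passes the actual data to Zenodo, which manages data storage. Using the DOI that Zenodo returns, the system creates a permanent link between the knowledge portal and the Zenodo storage platform. Users can search the metadata in the knowledge portal and view detailed descriptions of the data together with links to the data on the Zenodo platform.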
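
The deposit-and-link workflow described above (metadata kept locally, data handed to an external repository, DOI returned and stored with the metadata) can be illustrated with a minimal Python sketch. This is not the portal's actual implementation: the class and field names (`KnowledgePortal`, `DatasetRecord`, `make_fake_deposit`) are hypothetical, the metadata fields are a drastic simplification of the WMO standard, and the call to Zenodo is replaced by an in-memory stand-in that returns a made-up DOI under a placeholder prefix.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class SensorMetadata:
    name: str
    variable: str  # observed quantity, e.g. "air_temperature"


@dataclass
class SiteMetadata:
    name: str
    sensors: List[SensorMetadata] = field(default_factory=list)


@dataclass
class DatasetRecord:
    """Locally stored metadata; the data itself lives in the external repository."""
    network: str
    site: SiteMetadata
    description: str
    doi: Optional[str] = None  # filled in once the data has been deposited


class KnowledgePortal:
    """Hypothetical portal: keeps metadata locally and links to data by DOI."""

    def __init__(self, deposit: Callable[[bytes], str]) -> None:
        # `deposit` stands in for the external repository: it takes the raw
        # data and returns a DOI string (assumption for this sketch).
        self._deposit = deposit
        self._records: List[DatasetRecord] = []

    def register(self, record: DatasetRecord, data: bytes) -> DatasetRecord:
        # Pass the data to the repository and keep only the returned DOI.
        record.doi = self._deposit(data)
        self._records.append(record)
        return record

    def search(self, term: str) -> List[Dict[str, str]]:
        # Case-insensitive substring search over the local metadata only.
        term = term.lower()
        hits = []
        for r in self._records:
            haystack = " ".join(
                [r.network, r.site.name, r.description]
                + [s.name for s in r.site.sensors]
                + [s.variable for s in r.site.sensors]
            ).lower()
            if term in haystack:
                hits.append(
                    {"description": r.description, "link": f"https://doi.org/{r.doi}"}
                )
        return hits


def make_fake_deposit() -> Callable[[bytes], str]:
    """In-memory stand-in for the repository; the DOI prefix is a placeholder."""
    counter = itertools.count(1)

    def deposit(data: bytes) -> str:
        return f"10.0000/example.{next(counter)}"

    return deposit


if __name__ == "__main__":
    portal = KnowledgePortal(make_fake_deposit())
    site = SiteMetadata("Site A", [SensorMetadata("T1", "air_temperature")])
    portal.register(DatasetRecord("Network X", site, "Hourly air temperature"), b"...")
    print(portal.search("temperature"))
```

Injecting the deposit step as a function keeps the sketch independent of any particular repository API; in a real deployment that function would wrap the repository's upload and publish calls.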