Recent advances in machine learning have been supported by the emergence of domain-specific software libraries, enabling streamlined workflows and increased reproducibility. For geospatial machine learning (GeoML), the availability of Earth observation data has outpaced the development of domain libraries to handle its unique challenges, such as varying spatial resolutions, spectral properties, temporal cadence, data coverage, coordinate systems, and file formats. This chapter presents a comprehensive overview of GeoML libraries, analyzing their evolution, core functionalities, and the current ecosystem. It also introduces popular GeoML libraries such as TorchGeo, eo-learn, and Raster Vision, detailing their architecture, supported data types, and integration with ML frameworks. Additionally, it discusses common methodologies for data preprocessing, spatial--temporal joins, benchmarking, and the use of pretrained models. Through a case study in crop type mapping, it demonstrates practical applications of these tools. Best practices in software design, licensing, and testing are highlighted, along with open challenges and future directions, particularly the rise of foundation models and the need for governance in open-source geospatial software. Our aim is to guide practitioners, developers, and researchers in navigating and contributing to the rapidly evolving GeoML landscape.