Anthropogenic pressure (i.e., human influence) on the environment is one of the largest causes of biodiversity loss. Wilderness areas, in contrast, are home to undisturbed ecological processes. However, there is no biophysical definition of the term wilderness. Instead, wilderness is more of a philosophical or cultural concept and thus cannot be easily delineated or categorized in a technical manner. In this paper, we (i) introduce the task of wilderness mapping by means of machine learning applied to satellite imagery and (ii) publish MapInWild, a large-scale benchmark dataset curated for that task. MapInWild is a multi-modal dataset comprising geodata acquired and derived from a diverse set of Earth observation sensors. The dataset consists of 8144 images, each 1920 × 1920 pixels, and is approximately 350 GB in size. The images are weakly annotated with three classes derived from the World Database on Protected Areas: Strict Nature Reserves, Wilderness Areas, and National Parks. With the dataset, which is intended to serve as a testbed for developments in fields such as explainable machine learning and environmental remote sensing, we hope to contribute to a deeper understanding of the question "What makes nature wild?".
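To give a feel for the scale implied by the figures above, the following back-of-the-envelope sketch estimates the average storage per image and per pixel location. It is only an illustration: it assumes that "350 GB" means 350 × 10^9 bytes and that the total is spread evenly over all images, neither of which is stated in the abstract, and the function name `per_image_stats` is hypothetical.

```python
def per_image_stats(total_bytes: float, n_images: int, side: int) -> dict:
    """Average storage per image and per pixel location.

    Assumes the total size is spread evenly over all images.
    """
    pixels = side * side                     # pixel locations in one image
    bytes_per_image = total_bytes / n_images
    return {
        "pixels": pixels,
        "mb_per_image": bytes_per_image / 1e6,        # decimal megabytes
        "bytes_per_pixel": bytes_per_image / pixels,  # summed over all modalities
    }


# Figures quoted in the abstract; 1 GB is taken as 10**9 bytes (an assumption).
stats = per_image_stats(total_bytes=350e9, n_images=8144, side=1920)
print(f"{stats['pixels']:,} pixel locations per image")
print(f"{stats['mb_per_image']:.1f} MB per image on average")
print(f"{stats['bytes_per_pixel']:.2f} bytes per pixel location on average")
```

Under these assumptions each image averages roughly 43 MB, or a little under 12 bytes per pixel location, which is consistent with several co-registered layers being stored for every location.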