Accurate mapping of oil palm is important for understanding its past and future impact on the environment. We propose to map and count oil palms by estimating tree density per pixel, which enables large-scale analysis while still permitting fine-grained study, for example of different planting patterns. To that end, we propose a new active deep learning method to estimate oil palm density at large scale from Sentinel-2 satellite images, and apply it to generate complete maps for Malaysia and Indonesia. What makes the regression of oil palm density challenging is the need for representative reference data that covers all relevant geographical conditions across a large territory. For density estimation specifically, generating reference data involves counting individual trees. To keep the associated labelling effort low, we propose an active learning (AL) approach that automatically chooses the most relevant samples to be labelled. Our method relies on estimates of the epistemic model uncertainty and of the diversity among samples, making it possible to retrieve an entire batch of relevant samples in a single iteration. Moreover, our algorithm has linear computational complexity and is easily parallelisable to cover large areas. We use our method to compute the first oil palm density map with $10\,$m Ground Sampling Distance (GSD), for all of Indonesia and Malaysia and for two different years, 2017 and 2019. The maps have a mean absolute error of 7.3 trees/ha, estimated from an independent validation set. We also analyse density variations between different states within a country and compare them to official estimates. According to our estimates there are, in total, $>1.2$ billion oil palms in Indonesia covering $>15$ million ha, and $>0.5$ billion oil palms in Malaysia covering $>6$ million ha.
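To make the idea of batch selection from uncertainty and diversity concrete, the following is a minimal Python sketch and not the method proposed here. It assumes that epistemic uncertainty can be approximated by the variance among the predictions of an ensemble, and that diversity can be approximated by splitting the samples into coarse groups along a random projection of their feature vectors; the function `select_batch` and all of its parameters are hypothetical names introduced for illustration only.

```python
import numpy as np


def select_batch(ensemble_preds, features, batch_size, n_groups=4, seed=0):
    """Pick up to `batch_size` unlabelled samples to send for labelling.

    ensemble_preds: array (n_models, n_samples) of predicted densities.
    features:       array (n_samples, d) of per-sample feature vectors.
    Both proxies below are assumptions of this sketch, not the paper's method.
    """
    # Uncertainty proxy (assumption): disagreement among ensemble members.
    uncertainty = ensemble_preds.var(axis=0)

    # Diversity proxy (assumption): fixed-width bins along a random projection.
    rng = np.random.default_rng(seed)
    proj = features @ rng.normal(size=features.shape[1])
    span = proj.max() - proj.min()
    if span == 0:
        groups = np.zeros(len(proj), dtype=int)
    else:
        scaled = (proj - proj.min()) / span * n_groups
        groups = np.minimum(scaled.astype(int), n_groups - 1)

    # Take the most uncertain samples from each group in turn.
    per_group = -(-batch_size // n_groups)  # ceiling division
    selected = []
    for g in range(n_groups):
        idx = np.flatnonzero(groups == g)
        if idx.size == 0:
            continue
        k = min(per_group, idx.size)
        top = idx[np.argpartition(-uncertainty[idx], k - 1)[:k]]
        selected.extend(top.tolist())

    # Trim to the requested size, keeping the most uncertain candidates.
    selected.sort(key=lambda i: -uncertainty[i])
    return selected[:batch_size]
```

For a fixed number of groups, every step is a single pass over the samples or an average-case linear-time partition, so the cost of this sketch grows linearly with the number of candidates. It may return fewer than `batch_size` samples when some groups are empty or small.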