Semantic segmentation from aerial views is a vital task for autonomous drones, which require accurate segmentation to navigate safely and efficiently. Segmenting aerial images is especially challenging because they exhibit diverse viewpoints, extreme scale variation, and high scene complexity. To address this problem, we propose an end-to-end diffusion model for multi-class semantic segmentation. We introduce recursive denoising, which allows predicted error to propagate through the denoising process. We combine this with a hierarchical multi-scale approach that is complementary to the diffusion process. Our method achieves state-of-the-art results on UAVid and on the Vaihingen building segmentation benchmark.
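As a rough illustration of what a recursive denoising loop could look like, the sketch below feeds each step's own prediction back in as the input to the next noising step, so that any error in a prediction is carried forward rather than reset. This is not the authors' implementation: the linear noise schedule, the function names (`add_noise`, `toy_denoiser`, `recursive_denoise`), and the softmax stand-in for a trained network are all assumptions made for illustration, and the hierarchical multi-scale component is omitted.

```python
import numpy as np


def add_noise(seg, t, num_steps, rng):
    """Blend a soft segmentation with Gaussian noise.

    Assumed linear schedule: t == num_steps gives pure noise,
    smaller t keeps more of the segmentation.
    """
    alpha = 1.0 - t / num_steps
    noise = rng.standard_normal(seg.shape)
    return alpha * seg + (1.0 - alpha) * noise


def toy_denoiser(noisy_seg, evidence):
    """Hypothetical stand-in for a trained denoising network.

    Adds per-class evidence (shape (C, H, W)) to the noisy input and
    applies a softmax over the class axis.
    """
    logits = noisy_seg + evidence
    exp = np.exp(logits - logits.max(axis=0, keepdims=True))
    return exp / exp.sum(axis=0, keepdims=True)


def recursive_denoise(evidence, num_steps=4, seed=0):
    """Run the loop, re-noising the previous prediction at every step."""
    rng = np.random.default_rng(seed)
    seg = np.zeros_like(evidence, dtype=float)  # overwritten by noise at t == num_steps
    for t in range(num_steps, 0, -1):
        noisy = add_noise(seg, t, num_steps, rng)  # noise the previous prediction
        seg = toy_denoiser(noisy, evidence)        # predict a new segmentation
    return seg


if __name__ == "__main__":
    # Random per-class evidence for 3 classes on a 4x5 grid (illustrative data only).
    evidence = np.random.default_rng(1).standard_normal((3, 4, 5))
    probs = recursive_denoise(evidence)
    print(probs.shape)           # class-probability map, one channel per class
    print(probs.argmax(axis=0))  # hard label per pixel
```

The point of the sketch is the data flow in the loop: the segmentation that gets noised at step `t` is the model's previous output, not a clean reference, which is one plausible reading of how predicted error can propagate through the denoising process.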