Semantic segmentation aims to robustly predict coherent class labels for entire regions of an image. It is a scene understanding task that powers real-world applications (e.g., autonomous navigation). One important application, the use of imagery for automated semantic understanding of pedestrian environments, enables remote mapping of accessibility features in street environments. This application (and others like it) requires detailed geometric information about geographical objects. Semantic segmentation is a prerequisite for this task since it maps contiguous regions of the same class as single entities. Importantly, uses of semantic segmentation like ours rely on region-level rather than pixel-wise outcomes; however, most quantitative evaluation metrics (e.g., mean Intersection over Union) are based on pixel-wise similarity to a ground truth, which fails to capture the over- and under-segmentation properties of a segmentation model. Here, we introduce a new metric to assess region-based over- and under-segmentation. We analyze it and compare it to other metrics, demonstrating that it lends greater explainability to semantic segmentation model performance in real-world applications.
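As a minimal illustrative sketch (not the metric proposed in this work), the snippet below computes pixel-wise mean Intersection over Union with a hypothetical `mean_iou` helper, showing why such a metric is insensitive to region structure: a ground-truth region predicted as several disjoint fragments can score similarly to a single coherent prediction covering the same pixels.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Pixel-wise mean Intersection over Union (mIoU).

    pred, gt: integer label maps of identical shape (H, W).
    Only per-pixel overlap is aggregated; the metric carries no notion
    of how predicted regions are split (over-segmentation) or merged
    (under-segmentation).
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        gt_c = gt == c
        union = np.logical_or(pred_c, gt_c).sum()
        if union == 0:  # class absent from both maps: skip it
            continue
        inter = np.logical_and(pred_c, gt_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy example: the class-1 region is split into two pieces in the
# prediction, yet mIoU only registers the missing pixel, not the split.
gt   = np.array([[1, 1, 1, 1],
                 [0, 0, 0, 0]])
pred = np.array([[1, 0, 1, 1],
                 [0, 0, 0, 0]])
print(mean_iou(pred, gt, num_classes=2))
```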