A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented, Temporal and Depth-aware design

Semantic image and video segmentation are among the most important tasks in computer vision today, since they provide a complete and meaningful representation of the environment through a dense classification of the pixels in a given scene. Recently, Deep Learning, and more precisely Convolutional Neural Networks, have raised semantic segmentation to a new level in terms of performance and generalization capabilities. However, designing Deep Semantic Segmentation models is a complex task, as it may involve application-dependent aspects. In particular, autonomous driving applications require attention to the robustness-efficiency trade-off, to intrinsic limitations (computational and memory bounds, data scarcity), and to constraints such as real-time inference. In this respect, additional data modalities, such as depth perception for reasoning about the geometry of a scene and temporal cues from videos for exploiting redundancy and consistency, are promising directions that have not yet been explored to their full potential in the literature. In this paper, we survey the most relevant and recent advances in Deep Semantic Segmentation in the context of vision for autonomous vehicles, from three different perspectives: efficiency-oriented model development for real-time operation, RGB-Depth data integration (RGB-D semantic segmentation), and the use of temporal information from videos in temporally-aware models. Our main objective is to provide a comprehensive discussion of the main methods, advantages, limitations, results and challenges from each perspective, so that the reader can not only get started but also stay up to date with recent advances in this exciting and challenging research field.

