Semantic segmentation serves as the backbone of many vision systems, from self-driving cars and robot navigation to augmented reality and teleconferencing. Because these systems frequently operate under stringent latency constraints within a limited resource envelope, optimising for efficient execution becomes important. At the same time, the heterogeneous capabilities of the target platforms and the diverse constraints of different applications require the design and training of multiple target-specific segmentation models, leading to excessive maintenance costs. To this end, we propose a framework for converting state-of-the-art segmentation CNNs to Multi-Exit Semantic Segmentation (MESS) networks: specially trained models that employ parametrised early exits along their depth to i) dynamically save computation during inference on easier samples and ii) save training and maintenance cost by offering a post-training customisable speed-accuracy trade-off. Designing and training such networks naively can hurt performance. Thus, we propose a novel two-staged training scheme for multi-exit networks. Furthermore, the parametrisation of MESS enables co-optimising the number, placement and architecture of the attached segmentation heads, along with the exit policy, upon deployment via exhaustive search in <1 GPUh. This allows MESS to rapidly adapt to the device capabilities and application requirements of each target use-case, offering a train-once-deploy-everywhere solution. Compared to the original backbone network, MESS variants achieve latency gains of up to 2.83x with the same accuracy, or 5.33 pp higher accuracy for the same computational budget. Lastly, MESS delivers orders-of-magnitude faster architecture selection than state-of-the-art techniques.
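To make the early-exit idea concrete, the following is a minimal sketch of input-dependent inference with multiple exits. It is not the authors' implementation: the abstract does not specify the exit policy, so the sketch assumes a simple confidence criterion (exit once a sufficient fraction of pixels has a top-1 probability above a threshold). The names `confident_fraction`, `multi_exit_predict`, `pixel_threshold` and `min_fraction` are hypothetical, and each exit is modelled as a plain callable rather than a real network head.

```python
from typing import Callable, List, Sequence, Tuple

# A probability map is a list of pixels, each a list of class probabilities.
ProbMap = List[List[float]]


def confident_fraction(prob_map: ProbMap, pixel_threshold: float) -> float:
    """Fraction of pixels whose top-1 probability reaches pixel_threshold."""
    if not prob_map:
        return 0.0
    confident = sum(1 for pixel in prob_map if max(pixel) >= pixel_threshold)
    return confident / len(prob_map)


def multi_exit_predict(
    exits: Sequence[Callable[[], ProbMap]],
    pixel_threshold: float = 0.9,
    min_fraction: float = 0.8,
) -> Tuple[int, List[int]]:
    """Evaluate exits in depth order and stop at the first confident one.

    Each element of `exits` is a hypothetical stand-in for "run the backbone
    up to this exit and apply its segmentation head". The criterion used here
    is an assumption, not the policy described in the paper.
    Returns (index of the exit used, per-pixel argmax labels).
    """
    if not exits:
        raise ValueError("at least one exit is required")
    prob_map: ProbMap = []
    used = 0
    for used, run_exit in enumerate(exits):
        prob_map = run_exit()
        is_last = used == len(exits) - 1
        # The final exit always answers; earlier exits answer only if confident.
        if is_last or confident_fraction(prob_map, pixel_threshold) >= min_fraction:
            break
    labels = [max(range(len(p)), key=p.__getitem__) for p in prob_map]
    return used, labels


if __name__ == "__main__":
    # Toy two-pixel, two-class example with one early exit and the final exit.
    early = lambda: [[0.95, 0.05], [0.55, 0.45]]
    final = lambda: [[0.97, 0.03], [0.08, 0.92]]
    print(multi_exit_predict([early, final]))                    # strict policy
    print(multi_exit_predict([early, final], min_fraction=0.5))  # relaxed policy
```

In this toy setting, relaxing `min_fraction` lets easier inputs leave at the shallow exit, which is the kind of post-training speed-accuracy knob the abstract describes; the actual MESS search additionally selects the number, placement and architecture of the exits.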