Semantic segmentation serves as the backbone of many vision systems, from self-driving cars and robot navigation to augmented reality and teleconferencing. Because these systems frequently operate under stringent latency constraints within a limited resource envelope, optimising for efficient execution is important. At the same time, the heterogeneous capabilities of the target platforms and the diverse constraints of different applications require the design and training of multiple target-specific segmentation models, leading to excessive maintenance costs. To this end, we propose a framework for converting state-of-the-art segmentation CNNs to Multi-Exit Semantic Segmentation (MESS) networks: specially trained models that employ parametrised early exits along their depth to i) dynamically save computation during inference on easier samples and ii) save training and maintenance cost by offering a post-training customisable speed-accuracy trade-off. Designing and training such networks naively can hurt performance; we therefore propose a novel two-stage training scheme for multi-exit networks. Furthermore, the parametrisation of MESS enables co-optimising the number, placement and architecture of the attached segmentation heads along with the exit policy upon deployment, via exhaustive search in <1 GPUh. This allows MESS to adapt rapidly to the device capabilities and application requirements of each target use-case, offering a train-once-deploy-everywhere solution. Compared to the original backbone network, MESS variants achieve latency gains of up to 2.83x at the same accuracy, or 5.33 pp higher accuracy for the same computational budget. Lastly, MESS delivers architectural customisation that is orders of magnitude faster than state-of-the-art techniques.
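To make the early-exit idea concrete, the following is a minimal sketch of input-dependent inference in a multi-exit segmentation network. It is not the MESS implementation: the exit criterion (mean per-pixel top-class softmax probability compared against a threshold) is an assumption chosen for illustration, and the names `early_exit_inference`, `mean_confidence`, `stages`, `heads` and `threshold` are hypothetical. Backbone stages and exit heads are modelled as plain Python callables so the example runs without any deep-learning library.

```python
import math
from typing import Callable, List, Sequence, Tuple

Logits = List[List[float]]  # one row of class scores per pixel


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one pixel's class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_confidence(logits: Logits) -> float:
    """Average over pixels of the top-class softmax probability.

    Assumed exit criterion for this sketch; the source text does not
    specify the policy's exact form.
    """
    return sum(max(softmax(row)) for row in logits) / len(logits)


def early_exit_inference(
    x: Logits,
    stages: Sequence[Callable[[Logits], Logits]],
    heads: Sequence[Callable[[Logits], Logits]],
    threshold: float,
) -> Tuple[List[int], int]:
    """Run backbone stages in order and stop at the first confident exit.

    Returns the per-pixel class labels and the index of the exit used.
    The last exit is always accepted, whatever its confidence.
    """
    if not stages or len(stages) != len(heads):
        raise ValueError("need one exit head per backbone stage")
    features = x
    for exit_index, (stage, head) in enumerate(zip(stages, heads)):
        features = stage(features)   # deeper features cost more compute
        logits = head(features)      # segmentation head attached at this depth
        if mean_confidence(logits) >= threshold:
            break                    # easy sample: skip the remaining stages
    labels = [max(range(len(row)), key=row.__getitem__) for row in logits]
    return labels, exit_index


# Toy usage: each "stage" doubles the scores, so confidence grows with depth.
toy_stages = [lambda f: [[2 * v for v in row] for row in f]] * 3
toy_heads = [lambda f: f] * 3
toy_input = [[0.2, 0.0], [0.0, 0.1]]  # two pixels, two classes
labels, used_exit = early_exit_inference(toy_input, toy_stages, toy_heads, 0.6)
```

Under these assumptions, lowering `threshold` makes more inputs leave at shallow exits (lower latency, potentially lower accuracy), while raising it pushes inputs towards the final exit. This is the kind of post-training speed-accuracy knob the abstract describes.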