Semantic segmentation has achieved high accuracy in understanding spatial layout. For real-time tasks in dynamic scenes, we extend semantic segmentation into the temporal domain so that motion enhances spatial accuracy. We apply a shift-mode network to streaming input to ensure zero-latency output. Because consecutive inputs overlap as the network shifts, this paper identifies computation that repeats at fixed periods across network layers. To avoid this redundancy, we derive a Shift-Memory Network (SMN) from an encoding-decoding baseline that reuses network values without accuracy loss. The network is trained in patch mode, and its parameters are extracted for the SMN to perform inference promptly in compact memory. We segment dynamic scenes from 1D scanning input and 2D video. In experiments, the SMN achieves accuracy equivalent to shift mode at faster inference speeds and with much smaller memory. This will facilitate semantic segmentation in real-time applications on edge devices.
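The reuse idea can be illustrated with a minimal sketch. This is not the SMN implementation: it assumes a toy network of two causal 1D temporal convolutions over a scalar stream, and the names `conv_at`, `shift_mode`, and `shift_memory` are hypothetical. It only shows that caching layer values in fixed-size buffers gives the same outputs as recomputing the whole temporal window at every step, with fewer layer evaluations.

```python
from collections import deque


def conv_at(seq, weights, t):
    """Causal temporal convolution at step t; steps before 0 count as zero."""
    k = len(weights)
    total = 0.0
    for j, w in enumerate(weights):
        idx = t - (k - 1) + j
        if idx >= 0:
            total += w * seq[idx]
    return total


def shift_mode(stream, w1, w2):
    """Re-run both layers over the full temporal window at every step."""
    outputs, layer1_evals = [], 0
    k2 = len(w2)
    for t in range(len(stream)):
        hidden = {}
        for s in range(max(0, t - k2 + 1), t + 1):
            # Most of these values were already computed at step t - 1.
            hidden[s] = conv_at(stream, w1, s)
            layer1_evals += 1
        out = 0.0
        for j, w in enumerate(w2):
            s = t - (k2 - 1) + j
            if s >= 0:
                out += w * hidden[s]
        outputs.append(out)
    return outputs, layer1_evals


def shift_memory(stream, w1, w2):
    """Keep each layer's recent values in a FIFO; compute only the new one."""
    in_buf = deque([0.0] * len(w1), maxlen=len(w1))    # recent inputs
    hid_buf = deque([0.0] * len(w2), maxlen=len(w2))   # recent layer-1 values
    outputs, layer1_evals = [], 0
    for x in stream:
        in_buf.append(x)  # the oldest input is shifted out automatically
        hid_buf.append(sum(w * v for w, v in zip(w1, in_buf)))
        layer1_evals += 1
        outputs.append(sum(w * v for w, v in zip(w2, hid_buf)))
    return outputs, layer1_evals


if __name__ == "__main__":
    stream = [1.0, 2.0, 4.0, 3.0, 0.0, 5.0, 2.0, 1.0]
    w1 = [0.5, 0.25, 0.25]   # illustrative weights, not trained parameters
    w2 = [1.0, -1.0, 0.5]
    full, n_full = shift_mode(stream, w1, w2)
    cached, n_cached = shift_memory(stream, w1, w2)
    print("same outputs:", all(abs(a - b) < 1e-9 for a, b in zip(full, cached)))
    print("layer-1 evaluations:", n_full, "vs", n_cached)
```

In this sketch the memory cost is bounded by the kernel lengths rather than by the window size, which is the property that matters for streaming inference on small devices.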