We present a novel real-time semantic segmentation network in which the encoder both encodes the input and generates the parameters (weights) of the decoder. Furthermore, to allow maximal adaptivity, the weights of each decoder block vary spatially. For this purpose, we design a new type of hypernetwork, composed of a nested U-Net for drawing higher-level context features, a multi-headed weight-generating module that generates the weights of each decoder block immediately before they are consumed, for efficient memory utilization, and a primary network composed of novel dynamic patch-wise convolutions. Despite the use of these less-conventional blocks, our architecture achieves real-time performance. In terms of the runtime vs. accuracy trade-off, we surpass state-of-the-art (SotA) results on popular semantic segmentation benchmarks: PASCAL VOC 2012 (val. set), and real-time semantic segmentation on Cityscapes and CamVid. The code is available at: https://nirkin.com/hyperseg.
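To illustrate the core idea of a dynamic patch-wise convolution — a convolution whose weights are produced at inference time by a hypernetwork and differ per spatial patch — here is a minimal NumPy sketch. This is not the paper's implementation; the function name, the 1x1 kernel, and all tensor shapes are illustrative assumptions.

```python
import numpy as np

def dynamic_patchwise_conv1x1(x, w, patch):
    """Apply a different 1x1 convolution to each spatial patch.

    x: (C_in, H, W) feature map.
    w: (H // patch, W // patch, C_out, C_in) per-patch weights,
       standing in for the output of a hypernetwork weight head.
    """
    c_in, h, width = x.shape
    gh, gw, c_out, _ = w.shape
    out = np.zeros((c_out, h, width), dtype=x.dtype)
    for i in range(gh):
        for j in range(gw):
            ys = slice(i * patch, (i + 1) * patch)
            xs = slice(j * patch, (j + 1) * patch)
            # Each patch gets its own (C_out, C_in) weight matrix:
            # contract over channels, keep spatial dims.
            out[:, ys, xs] = np.einsum('oc,chw->ohw', w[i, j], x[:, ys, xs])
    return out

# Toy usage: 2 input channels, 3 output channels, an 8x8 map, 4x4 patches.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8))
w = rng.standard_normal((2, 2, 3, 2))  # 2x2 grid of per-patch weights
y = dynamic_patchwise_conv1x1(x, w, patch=4)
```

In the actual architecture, a standard convolution would share one weight tensor across all spatial locations; here the weight tensor is indexed by patch, which is what makes the layer spatially adaptive.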