It is well known that semantic segmentation neural networks (SSNNs) produce dense segmentation maps to resolve object boundaries, while restricting prediction to down-sampled grids to reduce computational cost. SSNNs such as U-Net therefore strike a balance between accuracy and training cost. We propose a spectral analysis to investigate the correlations among the resolution of the down-sampled grid, the loss function, and the accuracy of SSNNs. By analyzing the network back-propagation process in the frequency domain, we discover that the traditional loss function, cross-entropy, and the key features of CNNs are mainly affected by the low-frequency components of the segmentation labels. Our discoveries can be applied to SSNNs in several ways, including (i) determining an efficient low-resolution grid for resolving the segmentation maps, (ii) pruning the networks by truncating the high-frequency decoder features to save computation cost, and (iii) using block-wise weak annotation to save labeling time. The experimental results shown in this paper agree with our spectral analysis for networks such as DeepLab V3+ and Deep Aggregation Net (DAN).
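The claim that segmentation labels are dominated by low-frequency components can be illustrated with a small NumPy sketch. This is not the paper's method or code: the synthetic disk label, the square frequency cutoff, the majority-vote block annotation, and all function names (`disk_mask`, `low_pass`, `block_annotate`, `iou`) are assumptions chosen only for illustration.

```python
import numpy as np


def disk_mask(size=128, radius=40):
    """Hypothetical binary segmentation label: a filled disk centered in the image."""
    yy, xx = np.mgrid[:size, :size]
    c = (size - 1) / 2.0
    return ((yy - c) ** 2 + (xx - c) ** 2 <= radius ** 2).astype(np.float64)


def low_pass(label, cutoff):
    """Keep only Fourier coefficients with |fy|, |fx| <= cutoff (cycles per image).

    Returns the band-limited label and the fraction of spectral energy retained.
    """
    spectrum = np.fft.fft2(label)
    fy = np.fft.fftfreq(label.shape[0]) * label.shape[0]  # integer frequencies
    fx = np.fft.fftfreq(label.shape[1]) * label.shape[1]
    keep = (np.abs(fy)[:, None] <= cutoff) & (np.abs(fx)[None, :] <= cutoff)
    energy = np.abs(spectrum) ** 2
    energy_fraction = energy[keep].sum() / energy.sum()
    smooth = np.fft.ifft2(spectrum * keep).real
    return smooth, energy_fraction


def block_annotate(label, block):
    """Assumed form of block-wise weak annotation: majority vote per block,
    then nearest-neighbor upsampling back to the full grid."""
    h, w = label.shape
    blocks = label.reshape(h // block, block, w // block, block)
    coarse = (blocks.mean(axis=(1, 3)) >= 0.5).astype(np.float64)
    return np.kron(coarse, np.ones((block, block)))


def iou(a, b):
    """Intersection over union of two maps after thresholding at 0.5."""
    a, b = a > 0.5, b > 0.5
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()


if __name__ == "__main__":
    label = disk_mask()
    smooth, frac = low_pass(label, cutoff=8)
    print("energy retained by low frequencies:", frac)
    print("IoU of thresholded low-pass label:", iou(smooth, label))
    print("IoU of 8x8 block-wise label:", iou(block_annotate(label, 8), label))
```

In this toy setting, a small set of low-frequency coefficients retains most of the label's spectral energy, and both the band-limited label and the coarse block-wise label overlap substantially with the original mask. Real labels with thin structures or many small objects may behave differently.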