In established network architectures, shortcut connections are often used to feed the outputs of earlier layers as additional inputs to later layers. Despite the extraordinary effectiveness of shortcuts, open questions remain about their mechanism and characteristics. For example, why are shortcuts powerful? Why do shortcuts generalize well? In this paper, we investigate the expressivity and generalizability of a novel sparse shortcut topology. First, we demonstrate that this topology can empower a one-neuron-wide deep network to approximate any univariate continuous function. Then, we present a novel width-bounded universal approximator, in contrast to depth-bounded universal approximators, and extend the approximation result to a family of equally competent networks. Furthermore, using generalization bound theory, we show that the proposed shortcut topology enjoys excellent generalizability. Finally, we corroborate our theoretical analyses by comparing the proposed topology with popular architectures, including ResNet and DenseNet, on well-known benchmarks, and perform a saliency map analysis to interpret the proposed topology. Our work helps enhance the understanding of the role of shortcuts and suggests further opportunities to innovate neural architectures.