It is widely believed that deep neural networks contain layer specialization, wherein networks extract hierarchical features representing edges and patterns in shallow layers and complete objects in deeper layers. Unlike common feed-forward models that have distinct filters at each layer, recurrent networks reuse the same parameters at various depths. In this work, we observe that recurrent models exhibit the same hierarchical behaviors and the same performance benefits as depth, despite reusing the same filters at every recurrence. By training models of various feed-forward and recurrent architectures on several datasets for image classification as well as maze solving, we show that recurrent networks can closely emulate the behavior of non-recurrent deep models, often doing so with far fewer parameters.
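To make the architectural contrast concrete, the sketch below (a minimal PyTorch illustration of ours, not the paper's actual models) stacks a distinct convolutional filter at each depth in the feed-forward case, while the recurrent case applies one shared filter at every recurrence, so its parameter count is constant in depth.

```python
import torch
import torch.nn as nn

class FeedForwardStack(nn.Module):
    """One distinct conv filter per layer: parameters grow with depth."""
    def __init__(self, channels: int, depth: int):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(depth)
        )

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x

class RecurrentStack(nn.Module):
    """One shared conv filter applied `depth` times: parameters are
    constant in depth, yet the effective depth is the same."""
    def __init__(self, channels: int, depth: int):
        super().__init__()
        self.layer = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.depth = depth

    def forward(self, x):
        for _ in range(self.depth):
            x = torch.relu(self.layer(x))
        return x

if __name__ == "__main__":
    ff, rec = FeedForwardStack(16, 8), RecurrentStack(16, 8)
    count = lambda m: sum(p.numel() for p in m.parameters())
    # The recurrent stack has 1/8 of the feed-forward stack's parameters
    # while producing a feature map of the same effective depth.
    print(f"feed-forward params: {count(ff)}, recurrent params: {count(rec)}")
```

Both modules compute an 8-layer-deep composition of 3x3 convolutions; the only difference is whether the weights are tied across depth, which is the property the work's comparisons isolate.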