The convolutional neural network (CNN) has become a fundamental model for solving many computer vision problems. In recent years, a new class of CNNs, the recurrent convolutional neural network (RCNN), inspired by the abundant recurrent connections in the visual systems of animals, was proposed. The critical element of the RCNN is the recurrent convolutional layer (RCL), which incorporates recurrent connections among the neurons in a standard convolutional layer. As the number of recurrent iterations increases, the receptive fields (RFs) of neurons in the RCL expand unboundedly, which is inconsistent with biological facts. We propose to modulate the RFs of neurons by introducing gates into the recurrent connections. The gates control the amount of context information input to the neurons, so their RFs become adaptive. The resulting layer is called the gated recurrent convolution layer (GRCL). Multiple GRCLs constitute a deep model called the gated RCNN (GRCNN). The GRCNN was evaluated on several computer vision tasks, including object recognition, scene text recognition, and object detection, and obtained much better results than the RCNN. In addition, when combined with other adaptive-RF techniques, the GRCNN achieved performance competitive with state-of-the-art models on benchmark datasets for these tasks. The code is released at \href{https://github.com/Jianf-Wang/GRCNN}{https://github.com/Jianf-Wang/GRCNN}.
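The gated recurrent update described above can be illustrated with a minimal single-channel numpy sketch. This is an assumption-laden simplification, not the paper's exact formulation: the weight names (`w_f`, `w_r`, `w_gf`, `w_gr`) are illustrative, and the multi-channel convolutions and batch normalization used in the actual GRCL are omitted.

```python
import numpy as np

def conv2d_same(x, k):
    # Naive single-channel 2D cross-correlation with zero padding ('same' size).
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape, dtype=float)
    H, W = x.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grcl_step(u, x_prev, w_f, w_r, w_gf, w_gr):
    """One gated recurrent convolution iteration (sketch).

    The gate g depends on the feedforward input u and the previous
    state x_prev; it scales the recurrent term elementwise, so the
    flow of context (and hence the effective RF growth across
    iterations) is modulated per spatial location.
    """
    g = sigmoid(conv2d_same(u, w_gf) + conv2d_same(x_prev, w_gr))
    x = np.maximum(0.0, conv2d_same(u, w_f) + g * conv2d_same(x_prev, w_r))
    return x
```

A typical use would initialize the state from the feedforward path alone, `x = relu(conv(u, w_f))`, then apply `grcl_step` for a fixed number of iterations; when the gate saturates near zero at some location, the recurrent (context) contribution there is suppressed and the RF stops expanding.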