We propose a new neural network design paradigm, the Reversible Column Network (RevCol). The main body of RevCol is composed of multiple copies of a subnetwork, called columns, which are linked by multi-level reversible connections. This architectural scheme gives RevCol behavior very different from that of conventional networks: during forward propagation, features in RevCol are learned to be gradually disentangled as they pass through each column, and their total information is maintained rather than compressed or discarded as in other networks. Our experiments suggest that CNN-style RevCol models can achieve very competitive performance on multiple computer vision tasks such as image classification, object detection, and semantic segmentation, especially with a large parameter budget and a large dataset. For example, after ImageNet-22K pre-training, RevCol-XL obtains 88.2% ImageNet-1K accuracy. Given more pre-training data, our largest model, RevCol-H, reaches 90.0% on ImageNet-1K, 63.8% APbox on the COCO detection minival set, and 61.0% mIoU on ADE20k segmentation. To our knowledge, these are the best COCO detection and ADE20k segmentation results among pure (static) CNN models. Moreover, as a general macro-architecture design, RevCol can also be introduced into transformers or other neural networks, which is demonstrated to improve performance in both computer vision and NLP tasks. We release code and models at https://github.com/megvii-research/RevCol
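
To illustrate what "reversible" means here, the following minimal sketch shows a generic additive coupling between two feature streams, in the style of RevNet-like reversible networks. It is not the RevCol formulation, which the abstract does not spell out; the functions `f`, `g`, `forward`, and `inverse` are hypothetical stand-ins chosen only to show that the inputs can be recovered exactly from the outputs even when the inner transforms are themselves non-invertible.

```python
import math


def f(v):
    # Hypothetical stand-in for a learned block; any function works here.
    return [math.tanh(2.0 * a) for a in v]


def g(v):
    # Squaring is not invertible on its own, yet the coupling below still is.
    return [a * a for a in v]


def forward(x1, x2):
    """Additive coupling: y1 = x1 + f(x2), y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def inverse(y1, y2):
    """Recover the inputs by undoing the two additions in reverse order."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


x1, x2 = [0.5, -1.0], [2.0, 0.25]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
```

Because `r1` and `r2` match `x1` and `x2` up to floating-point rounding, no information about the inputs is lost in the forward pass, which is the property the abstract attributes to the reversible connections between columns.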