Improving existing neural network architectures can involve several design choices, such as manipulating the loss functions, employing a diverse learning strategy, exploiting gradient evolution at training time, optimizing the network hyper-parameters, or increasing the architecture depth. The latter approach is a straightforward solution, since it directly enhances the representation capabilities of a network; however, the increased depth generally incurs the well-known vanishing gradient problem. In this paper, borrowing from different methods addressing this issue, we introduce an interlaced multi-task learning strategy, named SIRe, to reduce the vanishing gradient in relation to the object classification task. The presented methodology directly improves a convolutional neural network (CNN) by preserving information from the input image through interlaced auto-encoders (AEs), and further refines the base network architecture by means of skip and residual connections. To validate the presented methodology, a simple CNN and various implementations of well-known networks are extended via the SIRe strategy and extensively tested on five collections, i.e., MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100, and Caltech-256. The SIRe-extended architectures achieve significantly improved performance across all models and datasets, thus confirming the effectiveness of the presented approach.
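To make the interlaced multi-task idea concrete, the following is a minimal, illustrative PyTorch sketch, not the authors' implementation. It assumes a toy layout in which each convolutional stage carries a residual (skip) connection and a small auxiliary auto-encoder branch that reconstructs the stage input, so that reconstruction losses can be interlaced with the classification loss. All names (SIReBlock, SIReCNN, sire_loss) and parameters (width, alpha) are hypothetical choices for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SIReBlock(nn.Module):
    """Conv stage with a residual connection and an auxiliary auto-encoder branch."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Auxiliary auto-encoder: compress the block output, then decode it back
        # to the input shape so information from earlier layers is preserved.
        self.encoder = nn.Conv2d(channels, channels // 2, kernel_size=3, padding=1)
        self.decoder = nn.Conv2d(channels // 2, channels, kernel_size=3, padding=1)

    def forward(self, x):
        h = F.relu(self.conv1(x))
        h = self.conv2(h)
        out = F.relu(h + x)  # residual (skip) connection
        recon = self.decoder(F.relu(self.encoder(out)))  # AE reconstruction branch
        return out, recon


class SIReCNN(nn.Module):
    """Toy classifier whose blocks also emit reconstructions for a multi-task loss."""

    def __init__(self, in_channels=3, num_classes=10, width=32, num_blocks=3):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, width, kernel_size=3, padding=1)
        self.blocks = nn.ModuleList(SIReBlock(width) for _ in range(num_blocks))
        self.head = nn.Linear(width, num_classes)

    def forward(self, x):
        h = F.relu(self.stem(x))
        aux = []  # (reconstruction, reconstruction target) pairs, one per block
        for block in self.blocks:
            inp = h
            h, recon = block(h)
            aux.append((recon, inp))
        logits = self.head(F.adaptive_avg_pool2d(h, 1).flatten(1))
        return logits, aux


def sire_loss(logits, targets, aux, alpha=0.1):
    """Interlaced multi-task loss: classification plus weighted reconstruction terms.

    `alpha` is an illustrative weight balancing the auxiliary AE objectives.
    """
    loss = F.cross_entropy(logits, targets)
    for recon, target_feat in aux:
        loss = loss + alpha * F.mse_loss(recon, target_feat)
    return loss


if __name__ == "__main__":
    # Usage sketch on random data.
    model = SIReCNN(in_channels=3, num_classes=10)
    images = torch.randn(4, 3, 32, 32)
    labels = torch.randint(0, 10, (4,))
    logits, aux = model(images)
    loss = sire_loss(logits, labels, aux)
    loss.backward()
```

In this sketch the auxiliary reconstruction gradients flow back through earlier layers alongside the classification gradient, which is the intuition behind using interlaced AEs to mitigate vanishing gradients; the exact placement and weighting of the auxiliary branches in SIRe may differ from this toy configuration.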