Deep neural networks often suffer from poor performance or even training failure due to the ill-conditioned problem, the vanishing/exploding gradient problem, and the saddle point problem. In this paper, a novel method that applies a gradient activation function (GAF) to the gradient is proposed to handle these challenges. Intuitively, the GAF enlarges tiny gradients and restricts large gradients. Theoretically, this paper gives the conditions that the GAF needs to satisfy and, on this basis, proves that the GAF alleviates the problems mentioned above. In addition, this paper proves that the convergence rate of SGD with the GAF is faster than that without the GAF under some assumptions. Furthermore, experiments on CIFAR, ImageNet, and PASCAL Visual Object Classes confirm the GAF's effectiveness. The experimental results also demonstrate that the proposed method can be adopted in various deep neural networks to improve their performance. The source code is publicly available at https://github.com/LongJin-lab/Activated-Gradients-for-Deep-Neural-Networks.
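To make the intuition concrete, the following is a minimal PyTorch sketch of applying an element-wise gradient activation before the optimizer update. The arctan-shaped function and the hyperparameters `alpha` and `beta` are illustrative assumptions for this sketch, not necessarily the exact GAF form or settings used in the paper; see the linked repository for the authors' implementation.

```python
import torch
import torch.nn as nn

def gaf(grad, alpha=3.0, beta=0.1):
    # Illustrative gradient activation function (assumed arctan form):
    # amplifies small gradients (~alpha * grad near zero) and bounds
    # large gradients (by alpha * pi / (2 * beta)).
    return (alpha / beta) * torch.atan(beta * grad)

# Hypothetical training step showing where the GAF is applied.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

x, y = torch.randn(32, 10), torch.randn(32, 1)
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
with torch.no_grad():
    for p in model.parameters():
        if p.grad is not None:
            p.grad.copy_(gaf(p.grad))  # activate gradients before the update
optimizer.step()
```

Because the transformation acts only on the gradients, the same pattern can wrap any gradient-based optimizer and network architecture without changing the model itself.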