This paper presents a novel technique based on gradient boosting to train a shallow neural network (NN). Gradient boosting is an additive expansion algorithm in which a series of models is trained sequentially to approximate a given function. A neural network can also be seen as an additive model, in which the scalar product of the responses of the last hidden layer and its weights provides the final output of the network. Instead of training the network as a whole, the proposed algorithm trains the network sequentially in $T$ steps. First, the bias term of the network is initialized with a constant approximation that minimizes the average loss over the data. Then, at each step, a portion of the network, composed of $J$ neurons, is trained to approximate the pseudo-residuals on the training data computed from the previous iterations. Finally, the $T$ partial models and the bias are combined into a single NN with $T \times J$ neurons in the hidden layer. Extensive experiments on classification and regression tasks are carried out, showing competitive generalization performance with respect to neural networks trained with standard solvers such as Adam, L-BFGS and SGD. Furthermore, we show that the design of the proposed method makes it possible to switch off a number of hidden units at test time (the units that were trained last) without a significant reduction in generalization ability. This allows the model to be adapted on the fly to different classification speed requirements.
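The following is a minimal sketch of the sequential, boosting-style training procedure described above, written for regression with squared loss so that the pseudo-residuals reduce to simple residuals. The hyper-parameter values, the use of tanh units, the inner gradient-descent loop, and all function names (`fit_block`, `boost_shallow_nn`, `predict`) are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch (assumed details): gradient-boosted training of a shallow NN.
import numpy as np

rng = np.random.default_rng(0)

def fit_block(X, r, J=5, steps=500, lr=0.1):
    """Fit a small block of J tanh hidden units to the pseudo-residuals r."""
    n, d = X.shape
    W = rng.normal(scale=0.5, size=(d, J))   # input-to-hidden weights
    b = np.zeros(J)                          # hidden biases
    v = np.zeros(J)                          # hidden-to-output weights
    for _ in range(steps):
        H = np.tanh(X @ W + b)               # (n, J) hidden activations
        err = H @ v - r                      # gradient of 0.5 * squared error
        grad_v = H.T @ err / n
        grad_H = np.outer(err, v) * (1 - H ** 2)
        grad_W = X.T @ grad_H / n
        grad_b = grad_H.mean(axis=0)
        v -= lr * grad_v
        W -= lr * grad_W
        b -= lr * grad_b
    return W, b, v

def boost_shallow_nn(X, y, T=10, J=5):
    """Sequentially train T blocks of J neurons, each on the current pseudo-residuals."""
    bias = y.mean()                          # constant initial approximation
    F = np.full(len(y), bias, dtype=float)   # current ensemble prediction
    blocks = []
    for _ in range(T):
        r = y - F                            # pseudo-residuals for squared loss
        W, b, v = fit_block(X, r, J=J)
        blocks.append((W, b, v))
        F += np.tanh(X @ W + b) @ v
    # Merge the T partial models into one hidden layer with T * J units.
    W_all = np.hstack([W for W, _, _ in blocks])
    b_all = np.concatenate([b for _, b, _ in blocks])
    v_all = np.concatenate([v for _, _, v in blocks])
    return W_all, b_all, v_all, bias

def predict(X, W_all, b_all, v_all, bias, n_units=None):
    """Optionally keep only the first n_units hidden units (the earliest-trained ones)."""
    if n_units is None:
        n_units = W_all.shape[1]
    H = np.tanh(X @ W_all[:, :n_units] + b_all[:n_units])
    return H @ v_all[:n_units] + bias
```

Because each block's hidden units and output weights are simply concatenated, truncating the concatenated weights to the first $t \times J$ units in `predict` is equivalent to switching off the units trained in the last $T - t$ boosting steps, which is the mechanism that allows trading accuracy for speed at test time.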