We propose a learning rate adaptation scheme, called QLAB, for descent optimizers. We derive QLAB by optimizing a quadratic approximation of the loss function, and QLAB can be combined with any optimizer that provides a descent update direction. Computing the adaptive learning rate with QLAB requires only one extra evaluation of the loss function. We theoretically prove the convergence of descent optimizers equipped with QLAB. We demonstrate the effectiveness of QLAB on a range of optimization problems by combining it with stochastic gradient descent, stochastic gradient descent with momentum, and Adam. The performance is validated on multi-layer neural networks, a CNN, VGG-Net, ResNet, and ShuffleNet, using two datasets, MNIST and CIFAR10.
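To make the mechanism concrete, the following is a minimal sketch, in Python, of the general idea the abstract describes: fit a one-dimensional quadratic model of the loss along the optimizer's descent direction from the current loss, its directional derivative, and one extra loss evaluation, then take the model's minimizer as the learning rate. The function and parameter names (quadratic_step_size, trial_step, fallback) are hypothetical illustrations and this is not the paper's exact QLAB rule.

```python
import numpy as np

def quadratic_step_size(loss_fn, theta, grad, direction, trial_step=1.0, fallback=0.1):
    """Estimate a learning rate by minimizing a 1-D quadratic model of the loss
    along `direction`, built from the current loss, the directional derivative,
    and one extra loss evaluation. (Illustrative sketch, not the exact QLAB rule.)"""
    phi0 = loss_fn(theta)                            # loss at the current point
    dphi0 = float(np.dot(grad, direction))           # slope along the descent direction
    phi1 = loss_fn(theta + trial_step * direction)   # the single extra loss evaluation
    # Fit phi(eta) ~= a*eta^2 + dphi0*eta + phi0 through the trial point.
    a = (phi1 - phi0 - dphi0 * trial_step) / trial_step**2
    if a <= 0:
        return fallback          # no quadratic minimizer; fall back to a safe default
    return -dphi0 / (2.0 * a)    # minimizer of the quadratic model

# Example: one gradient-descent step on a toy loss f(theta) = ||theta||^2 / 2
loss_fn = lambda th: 0.5 * float(np.dot(th, th))
theta = np.array([3.0, -4.0])
grad = theta.copy()              # gradient of the toy loss
direction = -grad                # plain gradient-descent direction
eta = quadratic_step_size(loss_fn, theta, grad, direction)
theta = theta + eta * direction  # here eta = 1.0, which jumps to the exact minimizer
```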