In our work, we propose a novel yet simple approach for obtaining an adaptive learning rate in gradient-based descent methods on classification tasks. Instead of the traditional approach of deriving adaptive learning rates from a decayed expectation of gradient-based terms, we use the angle between the current gradient and a new gradient, where the new gradient is computed along a direction orthogonal to the current gradient. This angle history helps us determine a better adaptive learning rate, leading to higher accuracy than existing state-of-the-art optimizers. On a wide variety of benchmark datasets, with prominent image classification architectures such as ResNet, DenseNet, EfficientNet, and VGG, our method achieves the highest accuracy on most datasets. Moreover, we prove that our method converges.
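To make the idea concrete, the following is a minimal toy sketch of an angle-based adaptive learning rate on a 2-D quadratic loss. The specific probe step size, the exponential decay of the angle history, and the scaling rule `lr = base_lr / (1 + angle_ema)` are illustrative assumptions, not the paper's exact update rule:

```python
import numpy as np

# Toy quadratic loss f(w) = 0.5 * w @ A @ w with gradient A @ w.
A = np.diag([1.0, 10.0])

def grad(w):
    return A @ w

w = np.array([1.0, 1.0])
base_lr = 0.05
beta = 0.9        # decay factor for the angle history (assumed)
angle_ema = 0.0   # decayed history of observed angles
eps = 1e-12       # numerical guard for normalization

for step in range(100):
    g = grad(w)
    # Direction orthogonal to the current gradient (90-degree rotation in 2-D).
    ortho = np.array([-g[1], g[0]])
    ortho = ortho / (np.linalg.norm(ortho) + eps)
    # "New" gradient evaluated after a small displacement along that direction.
    g_new = grad(w + 1e-3 * ortho)
    # Angle between the current and the new gradient.
    cos_t = g @ g_new / (np.linalg.norm(g) * np.linalg.norm(g_new) + eps)
    angle = np.arccos(np.clip(cos_t, -1.0, 1.0))
    # Fold the angle into a decayed history and shrink the step size
    # when the gradient field is turning sharply (one possible rule).
    angle_ema = beta * angle_ema + (1 - beta) * angle
    lr = base_lr / (1.0 + angle_ema)
    w = w - lr * g
```

On this well-conditioned toy problem the iterates shrink toward the minimizer at the origin; the point of the sketch is only the mechanics of probing orthogonally, measuring the angle, and modulating the step size from the angle history.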