We introduce Gravity, another algorithm for gradient-based optimization. In this paper, we explain how our novel idea changes parameters to reduce a deep learning model's loss. It has three intuitive hyper-parameters, for which we propose best values. We also propose an alternative to the moving average. To compare the performance of the Gravity optimizer with two common optimizers, Adam and RMSProp, two VGGNet models were trained on five standard datasets with a batch size of 128 for 100 epochs. The Gravity hyper-parameters did not need to be tuned for different models. As explained further in the paper, no overfitting-prevention technique was used, in order to investigate the direct impact of the optimizer itself on loss reduction. The results show that the Gravity optimizer performs more stably than Adam and RMSProp and achieves higher validation accuracy on datasets with more output classes, such as CIFAR-100 (Fine).
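To make the experimental setup concrete, the following is a minimal sketch of the comparison protocol described above, assuming Keras/TensorFlow. The function build_vgg_like is a hypothetical stand-in for the two VGGNet models used in the paper, and the Gravity optimizer itself is not reproduced here; only the benchmark scaffolding (CIFAR-100 Fine labels, batch size 128, 100 epochs, no dropout or weight decay) is shown.

```python
# Sketch of the optimizer-comparison setup (assumptions: Keras/TensorFlow,
# a simplified VGG-style network in place of the paper's exact VGGNet variants).
import tensorflow as tf
from tensorflow import keras

# CIFAR-100 with fine-grained (100-class) labels, as referenced in the abstract.
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar100.load_data(label_mode="fine")
x_train, x_test = x_train / 255.0, x_test / 255.0

def build_vgg_like(num_classes=100):
    # Small VGG-style convolutional stack; no dropout or weight decay,
    # matching the "no overfitting-prevention technique" setup.
    return keras.Sequential([
        keras.Input(shape=(32, 32, 3)),
        keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
        keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Conv2D(128, 3, padding="same", activation="relu"),
        keras.layers.Conv2D(128, 3, padding="same", activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(256, activation="relu"),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])

# Train identical models with the two baseline optimizers and record
# final validation accuracy; the Gravity optimizer would slot in the same way.
results = {}
for name, opt in [("adam", keras.optimizers.Adam()),
                  ("rmsprop", keras.optimizers.RMSprop())]:
    model = build_vgg_like()
    model.compile(optimizer=opt,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x_train, y_train,
                        batch_size=128, epochs=100,
                        validation_data=(x_test, y_test), verbose=2)
    results[name] = history.history["val_accuracy"][-1]

print(results)
```

A custom optimizer (such as Gravity) would be added to the comparison loop as a third entry once implemented as a keras.optimizers.Optimizer subclass; its update rule is described later in the paper.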