In deep learning, optimization plays a vital role. Focusing on image classification, this work investigates the pros and cons of widely used optimizers and proposes a new one: the Perturbated Unit Gradient Descent (PUGD) algorithm, which extends normalized gradient operations over tensors with perturbation so that updates take place within a unit space. Through a set of experiments and analyses, we show that PUGD performs locally bounded updating, meaning that the update at each step is controlled. On the other hand, PUGD can push models toward a flat minimum, where the error remains approximately constant, not only because gradient normalization naturally avoids stationary points, but also because it scans for sharpness within the unit ball. In a series of rigorous experiments, PUGD helps models achieve state-of-the-art Top-1 accuracy on Tiny ImageNet and competitive performance on CIFAR-{10, 100}. We open-source our code at https://github.com/hanktseng131415go/PUGD.
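To make the update rule concrete, below is a minimal, hypothetical PyTorch sketch of a PUGD-style step. The specifics are assumptions rather than the paper's exact formulation: we assume a two-pass scheme in which the perturbation radius `rho`, the per-tensor unit normalization, and the way the clean and perturbed gradients are combined are all illustrative choices; the authoritative implementation lives in the linked repository.

```python
# Hypothetical sketch of a PUGD-style update; not the authors' exact method.
import torch

def pugd_step(model, loss_fn, data, target, lr=0.1, rho=0.05, eps=1e-12):
    params = [p for p in model.parameters() if p.requires_grad]

    # First pass: gradients at the current weights.
    loss = loss_fn(model(data), target)
    grads = torch.autograd.grad(loss, params)

    # Perturb each parameter along its unit-normalized gradient
    # (assumed perturbation scheme; radius rho is illustrative).
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(rho * g / (g.norm() + eps))

    # Second pass: gradients at the perturbed weights, i.e. "scanning
    # sharpness" in a small ball around the current point.
    loss_adv = loss_fn(model(data), target)
    grads_adv = torch.autograd.grad(loss_adv, params)

    # Restore the weights, then take a unit-normalized step on the
    # combined direction, so each update stays locally bounded.
    with torch.no_grad():
        for p, g, g_adv in zip(params, grads, grads_adv):
            p.sub_(rho * g / (g.norm() + eps))  # undo the perturbation
            d = g + g_adv
            p.sub_(lr * d / (d.norm() + eps))   # bounded unit update
```

In practice such a step would be wrapped in a `torch.optim.Optimizer` subclass and combined with a learning-rate schedule; the sketch only illustrates why each update is bounded (the step direction is normalized to unit length per tensor) and how the perturbed gradient contributes sharpness information.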