Training a convolutional neural network is a high-dimensional, non-convex optimization problem. At present, it is inefficient in situations where parametric learning rates cannot be set with confidence. Some past works have introduced Newton methods for training deep neural networks, but Newton methods for convolutional neural networks involve complicated operations: computing the Hessian matrix in second-order methods becomes very complex because the finite differences method is mainly used with image data. Prior work on Newton methods for convolutional neural networks deals with this by using sub-sampled Hessian Newton methods. In this paper, we use the complete data instead of sub-sampled methods that handle only part of the data at a time. Further, we use parallel processing instead of serial processing in mini-batch computations. The parallel-processing approach in this study outperforms the previous approach in training time.
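To make the setting concrete, below is a minimal sketch of one Newton step computed with conjugate gradient, where the Hessian is never formed explicitly: Hessian-vector products are accumulated over mini-batches, and the per-batch products are evaluated in parallel. The quadratic loss, the helper names (`batch_hvp`, `full_hvp_parallel`, `newton_step_cg`), and the thread-based parallelism are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Illustrative quadratic-loss model: loss(w) = 0.5 * ||X w - y||^2.
# Its Hessian-vector product is X^T (X v), which can be accumulated
# over mini-batches without ever forming the Hessian explicitly.

def batch_hvp(X_b, v):
    """Hessian-vector product contribution of one mini-batch."""
    return X_b.T @ (X_b @ v)

def full_hvp_parallel(batches, v):
    """Sum per-batch Hessian-vector products, computed in parallel."""
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda X_b: batch_hvp(X_b, v), batches)
    return sum(parts)

def newton_step_cg(batches, grad, iters=50, tol=1e-8):
    """Approximately solve H d = -grad with conjugate gradient,
    using only Hessian-vector products over the complete data."""
    d = np.zeros_like(grad)
    r = -grad.copy()          # residual of H d = -grad at d = 0
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Hp = full_hvp_parallel(batches, p)
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d

# Tiny usage example with synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 10))
w_true = rng.standard_normal(10)
y = X @ w_true
w = np.zeros(10)
grad = X.T @ (X @ w - y)                  # full-data gradient
batches = np.array_split(X, 8)            # mini-batches for parallel HVPs
w += newton_step_cg(batches, grad)        # one Newton step
print(np.allclose(w, w_true, atol=1e-4))  # quadratic loss: one step suffices
```

Because each mini-batch contributes its Hessian-vector product independently and the contributions simply sum, the per-batch computations have no sequential dependency, which is what makes replacing serial mini-batch processing with parallel processing natural in this setting.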