K-FAC (arXiv:1503.05671, arXiv:1602.01407) is a tractable implementation of Natural Gradient (NG) for Deep Learning (DL), whose bottleneck is computing the inverses of the so-called ``Kronecker-factors'' (K-factors). RS-KFAC (arXiv:2206.15397) is a K-FAC improvement which provides a cheap way of estimating the K-factors' inverses. In this paper, we exploit the exponential-average construction paradigm of the K-factors, and use online numerical linear algebra techniques to propose an even cheaper (but less accurate) way of estimating the K-factors' inverses for fully connected layers. Numerical results show that RS-KFAC's inversion error can be reduced with minimal CPU overhead by adding our proposed update to it. Based on the proposed procedure, a correction to it, and RS-KFAC, we propose three practical algorithms for optimizing generic deep neural networks. Numerical results show that two of these outperform RS-KFAC for any target test accuracy on CIFAR10 classification with a slightly modified version of VGG16_bn. Our proposed algorithms achieve 91$\%$ test accuracy faster than SENG (the state-of-the-art implementation of empirical NG for DL; arXiv:2006.05924), but underperform it at higher target test accuracies.
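To fix ideas, the exponential-average construction the abstract refers to maintains each K-factor as a running average of statistics, so its inverse can be refreshed cheaply online rather than recomputed from scratch. The following is a minimal illustrative sketch only, not the paper's actual RS-KFAC or proposed update: it assumes a single rank-one statistic per step and uses a Sherman-Morrison refresh (an assumed stand-in for the online numerical linear algebra techniques mentioned above); the function name and decay parameter `rho` are hypothetical.

```python
import numpy as np

def ema_kfactor_update(A, A_inv, a, rho=0.95):
    """Illustrative sketch (not the paper's procedure): one
    exponential-moving-average update of a K-factor,
        A <- rho*A + (1 - rho) * a a^T,
    with its inverse refreshed in O(n^2) via Sherman-Morrison
    instead of an O(n^3) re-inversion."""
    A_new = rho * A + (1 - rho) * np.outer(a, a)
    # (rho*A + (1-rho) a a^T)^{-1} = (1/rho) * (A + c * a a^T)^{-1},  c = (1-rho)/rho
    c = (1 - rho) / rho
    Ainv_a = A_inv @ a
    A_inv_new = (A_inv - c * np.outer(Ainv_a, Ainv_a) / (1 + c * a @ Ainv_a)) / rho
    return A_new, A_inv_new

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)      # symmetric positive-definite K-factor
A_inv = np.linalg.inv(A)
a = rng.standard_normal(n)       # activation-like vector for one sample
A_new, A_inv_new = ema_kfactor_update(A, A_inv, a)
err = np.linalg.norm(A_inv_new @ A_new - np.eye(n))
```

In practice the rank of the per-step statistic is the mini-batch size, so a batched low-rank identity (e.g. Woodbury) replaces the rank-one formula; the trade-off the abstract describes is precisely cheapness versus accuracy of such online inverse estimates.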