Deep Learning (DL) methods have emerged as one of the most powerful tools for function approximation and prediction. While the representation properties of DL have been well studied, uncertainty quantification remains challenging and largely unexplored. Data augmentation techniques are a natural approach to providing uncertainty quantification and to integrating stochastic MCMC search with stochastic gradient descent (SGD) methods. The purpose of our paper is to show that training DL architectures with data augmentation leads to efficiency gains. To demonstrate our methodology, we develop data augmentation algorithms for a variety of commonly used activation functions: logit, ReLU, and SVM. We compare our methodology with traditional stochastic gradient descent with back-propagation. Our optimization procedure leads to a version of iteratively re-weighted least squares and can be implemented at scale with accelerated linear algebra methods, providing substantial performance improvements. We illustrate our methodology on a number of standard datasets. Finally, we conclude with directions for future research.
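To make the iteratively re-weighted least squares (IRLS) connection concrete, the following is a minimal sketch of classical IRLS for the logit case, i.e. fitting a single logistic layer by repeatedly solving a weighted least squares problem. This is an illustrative textbook routine, not the paper's data-augmentation algorithm; the function name and tolerances are our own choices.

```python
import numpy as np

def irls_logistic(X, y, n_iter=25, tol=1e-8):
    """Fit logistic regression coefficients by iteratively
    re-weighted least squares (Newton-Raphson in disguise)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta                              # linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))             # fitted probabilities
        s = mu * (1.0 - mu)                         # IRLS weights
        z = eta + (y - mu) / np.maximum(s, 1e-10)   # working response
        # Weighted least squares step: beta = (X' S X)^{-1} X' S z.
        # This dense solve is exactly the kind of linear-algebra kernel
        # that accelerated (GPU/BLAS) libraries speed up at scale.
        XtS = X.T * s
        beta_new = np.linalg.solve(XtS @ X, XtS @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```

Each pass re-weights the observations by the current fitted variances and re-solves a least squares system, so the whole iteration reduces to repeated matrix solves that vectorize well.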