Deep learning (DL) is transforming industry as decision-making processes are being automated by deep neural networks (DNNs) trained on real-world data. Driven in part by a rapidly expanding literature on DNN approximation theory showing that they can approximate a rich variety of functions, such tools are increasingly being considered for problems in scientific computing. Yet, unlike traditional algorithms in this field, little is known about DNNs from the principles of numerical analysis, e.g., stability, accuracy, computational efficiency and sample complexity. In this paper we introduce a computational framework for examining DNNs in practice, and use it to study their empirical performance with regard to these issues. We study the performance of DNNs of different widths and depths on test functions in various dimensions, including smooth and piecewise smooth functions. We also compare DL against best-in-class methods for smooth function approximation based on compressed sensing (CS). Our main conclusion from these experiments is that there is a crucial gap between the approximation theory of DNNs and their practical performance, with trained DNNs performing relatively poorly on functions for which there are strong approximation results (e.g. smooth functions), yet performing well in comparison to best-in-class methods for other functions. To analyze this gap further, we provide some theoretical insights. We establish a practical existence theorem, asserting the existence of a DNN architecture and training procedure that offers the same performance as CS. This establishes a key theoretical benchmark, showing that the gap can be closed, albeit via a strategy guaranteed to perform as well as, but no better than, current best-in-class schemes. Nevertheless, it demonstrates the promise of practical DNN approximation, by highlighting the potential for better schemes through careful design of DNN architectures and training strategies.