This tutorial paper surveys provably optimal alternatives to end-to-end backpropagation (E2EBP), the de facto standard for training deep architectures. Modular training refers to strictly local training with neither an end-to-end forward pass nor an end-to-end backward pass, i.e., dividing a deep architecture into several nonoverlapping modules and training them separately without any end-to-end operation. Between the fully global E2EBP and strictly local modular training lie weakly modular hybrids, which dispense with the end-to-end backward pass only. These alternatives can match or surpass the performance of E2EBP on challenging datasets such as ImageNet, and they are gaining increasing attention primarily because they offer practical advantages over E2EBP, which are enumerated herein. In particular, they allow for greater modularity and transparency in deep learning workflows, aligning deep learning with mainstream computer science engineering, which heavily exploits modularization for scalability. Modular training has also revealed novel insights about learning and has further implications for other important research domains. Specifically, it induces natural and effective solutions to important practical problems such as data efficiency and transferability estimation.
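To make the modular/weakly-modular distinction concrete, below is a minimal PyTorch sketch of one weakly modular scheme: each module carries a local auxiliary classifier that supplies its training signal, the end-to-end forward pass is kept, and `detach()` blocks any end-to-end backward pass across module boundaries. The architecture, module boundaries, local loss, and hyperparameters here are illustrative assumptions for exposition, not a method prescribed by this paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch (not from the paper): weakly modular training of a
# small CNN. The forward pass runs end-to-end, but each module is updated
# only by the gradient of its own local auxiliary classifier.

class LocalModule(nn.Module):
    def __init__(self, in_ch, out_ch, num_classes):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Local auxiliary head: the sole source of this module's training signal.
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(out_ch, num_classes),
        )

    def forward(self, x):
        h = self.body(x)
        return h, self.head(h)

modules = nn.ModuleList([
    LocalModule(3, 32, 10),
    LocalModule(32, 64, 10),
    LocalModule(64, 128, 10),
])
optimizers = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in modules]

def train_step(x, y):
    """One weakly modular update: local losses only, no end-to-end backward."""
    h = x
    for module, opt in zip(modules, optimizers):
        h, logits = module(h)
        loss = F.cross_entropy(logits, y)
        opt.zero_grad()
        loss.backward()  # gradient stays inside this module
        opt.step()
        h = h.detach()   # block the gradient from crossing the module boundary

# Usage example with random tensors standing in for a real dataset.
x = torch.randn(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
train_step(x, y)
```

Under this reading, fully modular training would go one step further and remove even the shared forward pass, e.g., by training each module to convergence on the frozen outputs of its predecessor before the next module is touched.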