Implicit models separate the definition of a layer from the description of its solution process. While implicit layers allow features such as depth to adapt automatically to new scenarios and inputs, this adaptivity makes their computational expense challenging to predict. In this manuscript, we increase the "implicitness" of the DEQ by redefining the method in terms of an infinite-time neural ODE, which paradoxically decreases the training cost over a standard neural ODE by 2-4x. Additionally, we address the question: is there a way to simultaneously achieve the robustness of implicit layers while retaining the reduced computational expense of an explicit layer? To solve this, we develop Skip and Skip Reg. DEQ, an implicit-explicit (IMEX) layer that simultaneously trains an explicit prediction followed by an implicit correction. We show that training this explicit predictor is free and even decreases the training time by 1.11-3.19x. Together, this manuscript shows how bridging the dichotomy of implicit and explicit deep learning can combine the advantages of both techniques.
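The explicit-predict / implicit-correct pattern described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the explicit predictor `g`, the layer map `f`, and the naive fixed-point iteration used as the implicit solver are all simplifying assumptions chosen for clarity.

```python
import numpy as np

def skip_deq_forward(x, f, g, tol=1e-6, max_iter=100):
    """Sketch of a Skip DEQ forward pass.

    g: explicit predictor giving a cheap initial guess z0 = g(x).
    f: implicit layer; the output is the fixed point z* = f(z*, x),
       found here by naive fixed-point iteration (a real DEQ would
       typically use a faster root-finding solver).
    """
    z = g(x)                      # explicit prediction
    for _ in range(max_iter):     # implicit correction
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# Toy contractive affine layer: z* = (I - W)^{-1} (x + b) in closed form,
# so convergence of the iteration is guaranteed (spectral radius of W < 1).
W = np.array([[0.3, 0.1], [0.0, 0.2]])
b = np.zeros(2)
f = lambda z, x: W @ z + x + b
g = lambda x: x                   # hypothetical explicit predictor

x = np.array([1.0, 2.0])
z_star = skip_deq_forward(x, f, g)
assert np.allclose(z_star, f(z_star, x), atol=1e-5)  # z* is a fixed point
```

The point of the skip connection is that a well-trained `g` lands the solver close to the fixed point, so the implicit correction needs far fewer iterations than starting from zero.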