Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer. These models have been shown to achieve performance competitive with the state-of-the-art deep networks while using significantly less memory. Yet they are also slower, brittle to architectural choices, and introduce potential instability to the model. In this paper, we propose a regularization scheme for DEQ models that explicitly regularizes the Jacobian of the fixed-point update equations to stabilize the learning of equilibrium models. We show that this regularization adds only minimal computational cost, significantly stabilizes the fixed-point convergence in both forward and backward passes, and scales well to high-dimensional, realistic domains (e.g., WikiText-103 language modeling and ImageNet classification). Using this method, we demonstrate, for the first time, an implicit-depth model that runs with approximately the same speed and level of performance as popular conventional deep networks such as ResNet-101, while still maintaining the constant memory footprint and architectural simplicity of DEQs. Code is available at https://github.com/locuslab/deq.
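To make the two ideas in the abstract concrete, below is a minimal PyTorch-style sketch, not the authors' implementation (see the repository above for that). It shows (1) a forward pass that iterates a single nonlinear layer f(z, x) toward a fixed point z* = f(z*, x), and (2) a penalty on a stochastic estimate of the squared Frobenius norm of the Jacobian df/dz at the equilibrium, computed with Rademacher vector-Jacobian probes (a Hutchinson-style estimator). The class name DEQSketch, the plain fixed-point iteration, and the weight 0.1 are illustrative assumptions; for brevity the sketch also backpropagates through a single application of f at the equilibrium rather than solving the implicit backward equation used by full DEQ implementations.

```python
# A hedged sketch of a DEQ layer with Jacobian regularization.
# Assumptions (not from the paper): DEQSketch, num_iters, num_probes,
# the tanh layer f, and the penalty weight 0.1 are all illustrative.
import torch
import torch.nn as nn


class DEQSketch(nn.Module):
    def __init__(self, dim: int, num_iters: int = 30):
        super().__init__()
        # A single nonlinear "layer" f(z, x) whose fixed point replaces depth.
        self.linear_z = nn.Linear(dim, dim)
        self.linear_x = nn.Linear(dim, dim)
        self.num_iters = num_iters

    def f(self, z: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.linear_z(z) + self.linear_x(x))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Naive fixed-point iteration; real DEQs use faster root solvers
        # (e.g., Broyden's method). Kept simple here for clarity.
        z = torch.zeros_like(x)
        with torch.no_grad():
            for _ in range(self.num_iters):
                z = self.f(z, x)
        # One differentiable application at the equilibrium so gradients
        # flow to the parameters (a simplification of implicit differentiation).
        return self.f(z.requires_grad_(), x)

    def jacobian_penalty(self, z_star: torch.Tensor, x: torch.Tensor,
                         num_probes: int = 1) -> torch.Tensor:
        # Hutchinson-style estimate of ||df/dz||_F^2 at the equilibrium:
        # E_v[ ||v^T J_f||^2 ] with +/-1 (Rademacher) probes v, one
        # vector-Jacobian product per probe.
        fz = self.f(z_star, x)
        penalty = torch.zeros((), device=x.device)
        for _ in range(num_probes):
            v = torch.randint_like(fz, 2) * 2 - 1  # entries in {-1, +1}
            (vjp,) = torch.autograd.grad(fz, z_star, v, create_graph=True)
            penalty = penalty + vjp.pow(2).sum() / fz.numel()
        return penalty / num_probes


# Usage sketch: add the penalty to the task loss with a small weight.
model = DEQSketch(dim=64)
x = torch.randn(8, 64)
z_star = model(x)
task_loss = z_star.pow(2).mean()  # stand-in for a real task loss
loss = task_loss + 0.1 * model.jacobian_penalty(z_star, x)
loss.backward()
```

Because the penalty uses only one extra vector-Jacobian product per probe, its cost is a small constant on top of the forward pass, which is consistent with the abstract's claim that the regularization adds only minimal computational overhead.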