Deep neural nets typically perform end-to-end backpropagation to learn the weights, a procedure that creates synchronization constraints in the weight update step across layers and is not biologically plausible. Recent advances in unsupervised contrastive representation learning raise the question of whether a learning algorithm can also be made local, that is, whether the updates of lower layers can avoid depending directly on the computation of upper layers. While Greedy InfoMax learns each block separately with a local objective, we found that it consistently hurts readout accuracy in state-of-the-art unsupervised contrastive learning algorithms, possibly due to the greedy objective as well as gradient isolation. In this work, we discover that by overlapping local blocks stacked on top of each other, we effectively increase the decoder depth and allow upper blocks to implicitly send feedback to lower blocks. This simple design closes the performance gap between local learning and end-to-end contrastive learning algorithms for the first time. Aside from standard ImageNet experiments, we also show results on complex downstream tasks such as object detection and instance segmentation directly using readout features.
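The overlapping-block idea can be illustrated with a minimal sketch. The snippet below is an assumption-laden toy example (PyTorch, with made-up stage, head, and loss names; not the authors' code): consecutive local blocks share one stage, each block has its own contrastive-style objective, and the input to each block is detached so no gradient flows end to end. The shared stage therefore receives gradients from two local losses, which is how upper blocks implicitly feed back into lower ones.

```python
# Minimal sketch of local learning with overlapping blocks (hypothetical setup, not the paper's code).
import torch
import torch.nn as nn

# Toy "stages" standing in for residual stages, and one readout head per local block.
stages = nn.ModuleList([nn.Sequential(nn.Linear(64, 64), nn.ReLU()) for _ in range(4)])
heads = nn.ModuleList([nn.Linear(64, 32) for _ in range(3)])

def toy_contrastive_loss(z):
    # Placeholder for an InfoNCE-style objective; a real setup would contrast augmented views.
    return z.pow(2).mean()

x = torch.randn(8, 64)
losses = []
h = x
for i in range(3):
    s = stages[i](h)                  # first stage of block i (shared with block i-1)
    block_out = stages[i + 1](s)      # second stage of block i (shared with block i+1)
    losses.append(toy_contrastive_loss(heads[i](block_out)))
    h = s.detach()                    # gradient isolation: the next block starts from a detached input
sum(losses).backward()                # each local loss only updates its own (overlapping) stages
```

Because stage i+1 appears in both block i and block i+1, it is updated by both local losses, effectively deepening each block's decoder without reintroducing full end-to-end backpropagation.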