While some convolutional neural networks (CNNs) have surpassed human visual abilities in object classification, they often struggle to recognize objects in images corrupted by common types of noise, highlighting a major limitation of this family of models. Recently, it has been shown that simulating a primary visual cortex (V1) at the front of CNNs leads to small improvements in robustness to these image perturbations. In this study, we start with the observation that different variants of the V1 model show gains for specific corruption types. We then build a new model using an ensembling technique, which combines multiple individual models with different V1 front-end variants. The model ensemble leverages the strengths of each individual model, leading to significant improvements in robustness across all corruption categories and outperforming the base model by 38% on average. Finally, we show that, using distillation, it is possible to partially compress the knowledge in the ensemble model into a single model with a V1 front-end. While the ensembling and distillation techniques used here are hardly biologically plausible, the results demonstrate that by combining the specific strengths of different neuronal circuits in V1 it is possible to improve the robustness of CNNs to a wide range of perturbations.
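As a rough illustration of the two techniques named above, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the ensemble averages the softmax outputs of its members and that distillation minimizes the cross-entropy between the ensemble's soft targets and the student's predictions; the abstract does not specify either choice. All function names and logit values are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_probs(member_logits):
    """Average member probabilities (assumed ensembling rule).

    member_logits has shape (n_models, batch, n_classes).
    """
    return softmax(member_logits).mean(axis=0)

def distillation_loss(student_logits, teacher_probs, temperature=1.0):
    """Cross-entropy of the student against the teacher's soft targets."""
    log_p = np.log(softmax(student_logits, temperature) + 1e-12)
    return float(-(teacher_probs * log_p).sum(axis=-1).mean())

# Hypothetical logits from three members (e.g. different V1 front-end
# variants) for one image and three classes.
member_logits = np.array([
    [[2.0, 0.0, 0.0]],
    [[0.0, 1.0, 0.0]],
    [[2.0, 0.0, 0.0]],
])
teacher = ensemble_probs(member_logits)

# A student that reproduces the teacher distribution incurs a lower loss
# than a student that predicts a uniform distribution.
loss_matched = distillation_loss(np.log(teacher), teacher)
loss_uniform = distillation_loss(np.zeros((1, 3)), teacher)
```

In practice the student would be a single network with a V1 front-end trained by gradient descent on this loss, typically in a framework with automatic differentiation; the sketch only shows how the targets and the loss value are formed.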