While end-to-end training of Deep Neural Networks (DNNs) yields state-of-the-art performance in an increasing array of applications, it does not provide insight into, or control over, the features being extracted. We report here on a promising neuro-inspired approach to DNNs with sparser and stronger activations. We use standard stochastic gradient training, supplementing the end-to-end discriminative cost function with layer-wise costs promoting Hebbian ("fire together, wire together") updates for highly active neurons, and anti-Hebbian updates for the remaining neurons. Instead of batch norm, we use divisive normalization of activations (suppressing weak outputs using strong outputs), along with implicit $\ell_2$ normalization of neuronal weights. Experiments on the standard CIFAR-10 image classification task demonstrate that, relative to baseline end-to-end trained architectures, our proposed architecture (a) leads to sparser activations (with only a slight compromise on accuracy), (b) exhibits more robustness to noise (without being trained on noisy data), and (c) exhibits more robustness to adversarial perturbations (without adversarial training).
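To make the layer-wise cost concrete, here is a minimal sketch of one plausible form: a cost whose gradient is Hebbian for the most active neurons and anti-Hebbian for the rest. The top-$k$ selection, the squared-activation terms, and the function name are illustrative assumptions, not the paper's exact formulation. For a linear neuron $y = w \cdot x$, gradient descent on $-y^2/2$ changes $w$ by $+\eta\, y\, x$ (Hebbian), while descent on $+y^2/2$ changes it by $-\eta\, y\, x$ (anti-Hebbian).

```python
import numpy as np

def layerwise_hebbian_cost(activations, k):
    """Hypothetical layer-wise cost: reward the squared outputs of the k most
    active neurons (Hebbian gradient) and penalize the rest (anti-Hebbian)."""
    order = np.argsort(activations)[::-1]          # most active neurons first
    top, rest = order[:k], order[k:]
    hebbian_term = -0.5 * np.sum(activations[top] ** 2)
    anti_hebbian_term = 0.5 * np.sum(activations[rest] ** 2)
    return hebbian_term + anti_hebbian_term

# Toy usage on one layer's activations for a single input.
acts = np.array([0.1, 2.0, 0.3, 1.5, 0.05])
print(layerwise_hebbian_cost(acts, k=2))
```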
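Likewise, a minimal sketch of divisive normalization of activations, in which strong outputs suppress weak ones. Pooling over the entire layer and the stabilizing constant `sigma` are assumptions for illustration; the paper's normalization pool may differ.

```python
import numpy as np

def divisive_normalize(a, sigma=1.0):
    """Divide each activation by a pooled activity measure over the layer,
    so relatively weak outputs are suppressed by strong ones."""
    pooled = np.sqrt(sigma ** 2 + np.sum(a ** 2, axis=-1, keepdims=True))
    return a / pooled

# Toy usage: after normalization, the single strong neuron dominates.
acts = np.array([[0.1, 0.2, 3.0, 0.05]])
print(divisive_normalize(acts))
```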