This work addresses adversarial robustness in deep learning by considering deep networks with stochastic local winner-takes-all (LWTA) activations. Units of this type yield sparse representations at each model layer, because the units are organized in blocks in which only one unit generates a non-zero output. The operating principle of the introduced units is stochastic: the network performs posterior sampling over the competing units in each block to select the winner. We combine these LWTA arguments with tools from Bayesian nonparametrics, specifically the stick-breaking construction of the Indian Buffet Process, to infer the sub-part of each layer that is essential for modeling the data at hand. Inference is performed by means of stochastic variational Bayes. We perform a thorough experimental evaluation of our model using benchmark datasets. As we show, our method achieves high robustness to adversarial perturbations, with state-of-the-art performance under powerful adversarial attack schemes.
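To make the block-wise competition concrete, the following is a minimal NumPy sketch of a forward pass through one stochastic LWTA layer. It is an illustration under stated assumptions, not the authors' implementation: the function name `stochastic_lwta`, the use of a softmax over each block's linear activations as the winner distribution, and the choice to pass the winner's linear activation through unchanged are all assumptions made here. The Indian Buffet Process component and the variational training procedure are omitted.

```python
import numpy as np

def stochastic_lwta(x, W, block_size, rng):
    """Hypothetical stochastic LWTA forward pass (illustrative only).

    x: (batch, in_dim) inputs.
    W: (in_dim, num_blocks * block_size) weights.
    Returns the sparse output and the per-block winner probabilities.
    """
    batch = x.shape[0]
    # Linear activations, grouped into blocks of competing units.
    h = (x @ W).reshape(batch, -1, block_size)

    # Assumed winner distribution: softmax over the units of each block.
    logits = h - h.max(axis=-1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)

    # Sample one winner per block with the Gumbel-max trick, which is
    # equivalent to drawing from the categorical distribution `probs`.
    gumbel = rng.gumbel(size=h.shape)
    winners = np.argmax(logits + gumbel, axis=-1)

    # Keep only the winner's activation; all other units output zero.
    mask = np.zeros_like(h)
    np.put_along_axis(mask, winners[..., None], 1.0, axis=-1)
    out = (h * mask).reshape(batch, -1)
    return out, probs

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 6))   # 3 blocks of 2 competing units
out, probs = stochastic_lwta(x, W, block_size=2, rng=rng)
```

Because the winner is resampled on every call, repeated forward passes on the same input can activate different units, which is the source of stochasticity the abstract refers to.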