One strategy for adversarially training a robust model is to maximize its certified radius -- the neighborhood around a given training sample within which the model's prediction remains unchanged. The scheme typically involves analyzing a "smoothed" classifier, where one estimates the prediction corresponding to Gaussian samples in the neighborhood of each sample in the mini-batch, accomplished in practice by Monte Carlo sampling. In this paper, we investigate the hypothesis that this sampling bottleneck can potentially be mitigated by identifying ways to directly propagate the covariance matrix of the smoothed distribution through the network. To this end, we find that, beyond certain adjustments to the network, propagating the covariances must also be accompanied by additional bookkeeping that tracks how the distributional moments transform and interact at each stage of the network. We show how satisfying these criteria yields an algorithm for maximizing the certified radius on datasets including Cifar-10, ImageNet, and Places365, while offering runtime savings on networks of moderate depth, at a small cost in overall accuracy. We describe the details of the key modifications that enable practical use. Via various experiments, we evaluate when our simplifications are sensible, and what the key benefits and limitations are.
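The Monte Carlo sampling step described above -- estimating the smoothed classifier's prediction by majority vote over Gaussian perturbations of an input -- can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation; the function names and the toy classifier are hypothetical.

```python
import numpy as np

def smoothed_predict(classify, x, sigma=0.25, n_samples=1000, rng=None):
    """Monte Carlo estimate of a smoothed classifier's prediction:
    the most frequent class over Gaussian perturbations of x,
    together with its empirical frequency. This per-sample sampling
    is the bottleneck the abstract proposes to mitigate by
    propagating covariances through the network instead."""
    rng = np.random.default_rng(rng)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    preds = np.array([classify(x + eps) for eps in noise])
    classes, counts = np.unique(preds, return_counts=True)
    top = np.argmax(counts)
    return int(classes[top]), counts[top] / n_samples

# Toy base classifier (hypothetical stand-in for a deep network):
# class 1 iff the coordinates sum to a positive value.
classify = lambda z: int(z.sum() > 0)

# A point well inside class 1, so the smoothed prediction should agree
# with the base prediction with high empirical frequency.
x = np.ones(8)
label, p_hat = smoothed_predict(classify, x, sigma=0.5, n_samples=500, rng=0)
```

Under the standard randomized-smoothing analysis, a larger estimated frequency `p_hat` translates into a larger certified radius, which is why the number of Monte Carlo samples per training point drives the cost the paper targets.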