The output distribution of a neural network (NN) over the entire input space captures the complete input-output mapping, offering insight toward a more comprehensive understanding of the network. Exhaustive enumeration or traditional Monte Carlo sampling over the entire input space can require impractical sampling time, especially for high-dimensional inputs. To make such difficult sampling computationally feasible, in this paper we propose a novel Gradient-based Wang-Landau (GWL) sampler. We first draw a connection between the output distribution of a NN and the density of states (DOS) of a physical system. We then renovate the classic sampler for the DOS problem, the Wang-Landau algorithm, by replacing its random proposals with gradient-based Monte Carlo proposals, so that our GWL sampler investigates under-explored subsets of the input space far more efficiently. Extensive experiments verify the accuracy of the output distribution generated by GWL and also reveal several interesting findings; for example, in a binary image classification task, both a CNN and a ResNet map the majority of human-unrecognizable images to very negative logit values.
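The idea described above — Wang-Landau density-of-states estimation with gradient-informed proposals in place of random-walk moves — can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the scalar function `f` stands in for a network's output (the "energy"), and the Langevin-style drifted proposal, step sizes, binning, and flatness criterion are all placeholder choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scalar "network output" f(x): it plays the role of the energy
# in Wang-Landau. Its gradient is available in closed form here.
def f(x):
    return np.sum(x ** 2)

def grad_f(x):
    return 2.0 * x

# Discretize the output range into bins (the "energy levels").
bins = np.linspace(0.0, 8.0, 33)

def bin_index(e):
    return int(np.clip(np.digitize(e, bins) - 1, 0, len(bins) - 2))

STEP, NOISE = 0.05, 0.3  # illustrative drift and noise scales

# Log-density of the drifted Gaussian proposal (up to a constant),
# needed to correct for the proposal's asymmetry.
def log_q(x_to, x_from):
    mean = x_from - STEP * grad_f(x_from)
    return -np.sum((x_to - mean) ** 2) / (2.0 * NOISE ** 2)

log_g = np.zeros(len(bins) - 1)  # running log-DOS estimate
hist = np.zeros(len(bins) - 1)   # visit histogram for the flatness check
log_f = 1.0                      # log modification factor

x = rng.normal(size=2)
e = f(x)

for sweep in range(20000):
    # Gradient-based proposal: drift along the negative output gradient
    # plus noise, so moves track the output landscape instead of
    # random-walking through it.
    x_new = x - STEP * grad_f(x) + NOISE * rng.normal(size=2)
    e_new = f(x_new)
    if bins[0] <= e_new < bins[-1]:
        i, j = bin_index(e), bin_index(e_new)
        # Wang-Landau acceptance: favor bins with low estimated DOS,
        # with a Metropolis-Hastings correction for the asymmetric proposal.
        delta = log_g[i] - log_g[j] + log_q(x, x_new) - log_q(x_new, x)
        if np.log(rng.random()) < delta:
            x, e = x_new, e_new
    k = bin_index(e)
    log_g[k] += log_f
    hist[k] += 1
    # Flatness check: when the histogram is roughly flat, halve the
    # modification factor and reset the histogram.
    if sweep % 2000 == 1999 and hist.min() > 0.8 * hist.mean():
        log_f /= 2.0
        hist[:] = 0.0
```

After enough sweeps, `log_g` (up to an additive constant) approximates the log output distribution over the binned output range; the gradient-informed proposal is what lets the chain reach output levels that plain random-walk proposals visit only rarely.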