Deep learning approaches have achieved state-of-the-art performance in many applications by relying on extremely large and heavily overparameterized neural networks. However, such networks have been shown to be very brittle, to generalize poorly to new use cases, and to be often difficult if not impossible to deploy on resource-limited platforms. Model pruning, i.e., reducing the size of the network, is a widely adopted strategy that can lead to more robust and generalizable networks, usually orders of magnitude smaller with the same or even improved performance. While many heuristics exist for model pruning, our understanding of the pruning process remains limited. Empirical studies show that some heuristics improve performance while others can make models more brittle or have other side effects. This work aims to shed light on how different pruning methods alter the network's internal feature representation and on the corresponding impact on model performance. To provide a meaningful comparison and characterization of the model feature space, we use three geometric metrics that are decomposed from the commonly adopted classification loss. With these metrics, we design a visualization system to highlight the impact of pruning on model predictions as well as on the latent feature embedding. The proposed tool provides an environment for exploring and studying differences among pruning methods and between pruned and original models. By leveraging our visualization, ML researchers can not only identify samples that are fragile to model pruning and data corruption but also obtain insights and explanations on how some pruned models achieve superior robustness.