In Machine Learning, Artificial Neural Networks (ANNs) are a very powerful tool, broadly used in many applications. Often, the selected (deep) architectures include many layers, and therefore a large number of parameters, which makes training, storage and inference expensive. This motivated a stream of research on compressing the original networks into smaller ones without excessively sacrificing performance. Among the many proposed compression approaches, one of the most popular is \emph{pruning}, whereby entire elements of the ANN (links, nodes, channels, \ldots) and the corresponding weights are deleted. Since the nature of the problem is inherently combinatorial (which elements to prune and which to keep), we propose a new pruning method based on Operational Research tools. We start from a natural Mixed-Integer Programming model for the problem, and we use the Perspective Reformulation technique to strengthen its continuous relaxation. Projecting away the indicator variables from this reformulation yields a new regularization term, which we call the Structured Perspective Regularization, that leads to structured pruning of the initial architecture. We test our method on some ResNet architectures applied to the CIFAR-10, CIFAR-100 and ImageNet datasets, obtaining competitive performance w.r.t.~the state of the art for structured pruning.
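As a simplified illustration of the projection step (a minimal scalar sketch under assumptions not stated in the abstract: a quadratic base regularizer, a single weight $w$ with its indicator $z$ relaxed to $(0,1]$, and no big-$M$ coupling; the actual Structured Perspective Regularization acts on groups of weights), the perspective of $w^2$ is $w^2/z$, and minimizing out $z$ yields a closed-form penalty:
% Hedged sketch: scalar case only; the paper's SPR term is group-structured.
\[
  \rho_\lambda(w) \;=\; \min_{z \in (0,1]} \Bigl\{ \lambda z + \tfrac{w^2}{z} \Bigr\}
  \;=\;
  \begin{cases}
    2\sqrt{\lambda}\,\lvert w\rvert & \text{if } \lvert w\rvert \le \sqrt{\lambda},\\
    \lambda + w^2 & \text{otherwise.}
  \end{cases}
\]
The resulting piecewise penalty is nonsmooth at zero, behaving like an $\ell_1$ term there, which is what drives small weights exactly to zero; in the structured setting one would replace $\lvert w\rvert$ with the norm of a whole group of weights (e.g., all weights of a channel), so entire channels are pruned together.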