We introduce a new method for speeding up the inference of deep neural networks, inspired in part by reduced-order modeling techniques for dynamical systems. The cornerstone of the proposed method is the maximum volume algorithm. We demonstrate its efficiency on neural networks pre-trained on different datasets. We show that in many practical cases it is possible to replace convolutional layers with much smaller fully-connected layers, with only a relatively small drop in accuracy.
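The maximum volume (maxvol) algorithm mentioned above selects, from a tall matrix, a square submatrix of (locally) maximal volume, i.e. maximal absolute determinant. The abstract does not specify an implementation, so the following is a minimal illustrative sketch of the standard greedy maxvol iteration with rank-1 updates; the function name, tolerance, and initialization by the first rows are assumptions, not the authors' code.

```python
import numpy as np

def maxvol(A, tol=1.05, max_iters=100):
    """Greedy maxvol sketch: return indices of r rows of a tall n x r
    matrix A whose r x r submatrix has locally maximal |det|.
    Assumes the first r rows form a nonsingular starting submatrix."""
    n, r = A.shape
    idx = np.arange(r)                    # initial guess: first r rows
    B = A @ np.linalg.inv(A[idx])         # n x r coefficient matrix
    for _ in range(max_iters):
        # Largest coefficient; a swap multiplies the volume by |B[i, j]|.
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) <= tol:           # no swap improves volume enough
            break
        # Swap row idx[j] for row i via a rank-1 update of B.
        B -= np.outer(B[:, j], (B[i] - (np.arange(r) == j)) / B[i, j])
        idx[j] = i
    return idx
```

Each accepted swap grows the submatrix volume by a factor of at least `tol`, so the iteration terminates; on exit, every entry of `A @ inv(A[idx])` is bounded by `tol` in absolute value, the property that makes the selected rows a good interpolation basis.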