Compression has emerged as one of the essential deep learning research topics, especially for edge devices with limited computation power and storage capacity. Among the main compression techniques, low-rank compression via matrix factorization has been known to suffer from two problems. First, extensive tuning is required. Second, the resulting compression performance is typically unimpressive. In this work, we propose a low-rank compression method that utilizes a modified beam search for automatic rank selection and a modified stable rank for compression-friendly training. The resulting BSR (Beam-search and Stable Rank) algorithm requires only a single hyperparameter to be tuned for the desired compression ratio. In terms of the trade-off curve between accuracy and compression ratio, BSR outperforms previously known low-rank compression methods. Furthermore, BSR performs on par with or better than state-of-the-art structured pruning methods. As with pruning, BSR can be easily combined with quantization for additional compression.
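To make the low-rank setting concrete, below is a minimal sketch of the standard ingredients the abstract builds on: the (unmodified) stable rank of a weight matrix and a truncated-SVD factorization at a chosen rank. The helper names, the layer shape, and the use of the stable rank as a per-layer rank heuristic are illustrative assumptions; BSR's modified beam search and modified stable rank are not reproduced here.

```python
import numpy as np

def stable_rank(W: np.ndarray) -> float:
    """Standard (unmodified) stable rank: squared Frobenius norm divided by
    squared spectral norm. It is always at most the true rank of W."""
    s = np.linalg.svd(W, compute_uv=False)  # singular values, descending
    return float(np.sum(s ** 2) / s[0] ** 2)

def factorize(W: np.ndarray, rank: int):
    """Truncated-SVD factorization W ~= A @ B with A: (m, rank), B: (rank, n).
    Replacing a dense m-by-n layer (m*n parameters) with A and B costs
    rank*(m + n) parameters, a saving whenever rank < m*n / (m + n)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

# Example on a hypothetical 512x256 layer weight.
W = np.random.randn(512, 256)
# Illustrative heuristic only: round the stable rank to pick a rank.
# BSR instead selects per-layer ranks jointly via its modified beam search.
r = max(1, int(round(stable_rank(W))))
A, B = factorize(W, r)
print(f"stable rank ~ {stable_rank(W):.1f}, "
      f"params: {W.size} -> {A.size + B.size}")
```

The parameter-count comparison in `factorize` is why rank selection matters: the factorization only compresses when the chosen rank is small relative to the layer dimensions, which is what an automatic, jointly tuned rank selection scheme is meant to guarantee.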