Over time, machine learning models have grown in scope, functionality, and size. As a consequence, training such models and serving inference from them increasingly requires high-end hardware. This paper explores approaches within the domain of model compression and discusses the efficiency of each, comparing model size and performance before and after compression.