Model compression is significant for the wide adoption of Recurrent Neural Networks (RNNs) in both user devices possessing limited resources and business clusters requiring quick responses to large-scale service requests. This work aims to learn structurally-sparse Long Short-Term Memory (LSTM) by reducing the sizes of basic structures within LSTM units, including input updates, gates, hidden states, cell states and outputs. Independently reducing the sizes of basic structures can result in inconsistent dimensions among them and, consequently, end up with invalid LSTM units. To overcome this problem, we propose Intrinsic Sparse Structures (ISS) in LSTMs. Removing a component of ISS simultaneously decreases the sizes of all basic structures by one and thereby always maintains dimension consistency. By learning ISS within LSTM units, the obtained LSTMs remain regular while having much smaller basic structures. Based on group Lasso regularization, our method achieves a 10.59x speedup without losing any perplexity on language modeling of the Penn TreeBank dataset. It is also successfully evaluated via a compact model with only 2.69M weights for machine Question Answering on the SQuAD dataset. Our approach further extends to non-LSTM RNNs, like Recurrent Highway Networks (RHNs). Our source code is publicly available at https://github.com/wenwei202/iss-rnns
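To make the ISS grouping concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' released code; see the repository above for that) of a group Lasso penalty whose groups follow the ISS pattern for a single `nn.LSTM` layer. Each group k collects, across the recurrent and input weight matrices, the column fed by hidden state k and the four gate/update rows that produce component k, so zeroing an entire group shrinks input updates, gates, hidden states, cell states, and outputs together. It is a simplified sketch: the paper's full ISS also spans connections into the following layer.

```python
import torch
import torch.nn as nn

def iss_group_lasso(lstm: nn.LSTM, lam: float = 1e-4) -> torch.Tensor:
    """Group Lasso penalty lam * sum_k ||w_group_k||_2 over ISS-style groups
    of a single-layer nn.LSTM (intra-layer weights only; a sketch, not the
    authors' implementation)."""
    h = lstm.hidden_size
    penalty = lstm.weight_hh_l0.new_zeros(())
    for k in range(h):
        pieces = []
        # Column k of the recurrent weights: hidden state h_{t-1}[k]
        # feeding all four gates/updates.
        pieces.append(lstm.weight_hh_l0[:, k])
        # Rows producing component k of each of the i, f, g, o blocks,
        # in both the input-to-hidden and hidden-to-hidden matrices.
        for blk in range(4):
            pieces.append(lstm.weight_ih_l0[blk * h + k, :])
            pieces.append(lstm.weight_hh_l0[blk * h + k, :])
        penalty = penalty + torch.cat(pieces).norm()
    return lam * penalty
```

In training, such a penalty would be added to the task loss; after convergence, hidden units whose ISS group norms fall below a small threshold can be removed, leaving a smaller but still regular (dense) LSTM.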