Deep neural networks (DNNs) have proven successful in a wide variety of applications such as speech recognition and synthesis, computer vision, machine translation, and game playing, to name but a few. However, existing deep neural network models are computationally expensive and memory intensive, which hinders their deployment on devices with limited memory or in applications with strict latency requirements. A natural goal, therefore, is to compress and accelerate deep networks without significantly degrading their performance, which is what we refer to as reducing model complexity. In the following work, we reduce the complexity of state-of-the-art LSTM models for natural language tasks such as text classification by distilling their knowledge into CNN-based models, thereby reducing the inference time (or latency) at test time.
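To make the distillation setup concrete, the sketch below shows one common formulation of the knowledge-distillation objective: the student (here, a CNN) is trained on a weighted sum of the usual cross-entropy on hard labels and the KL divergence between temperature-softened teacher (LSTM) and student output distributions. This is a minimal sketch assuming a PyTorch setup; the function name, temperature, and mixing weight `alpha` are illustrative defaults, not the exact configuration used in this work.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and the KL divergence
    between temperature-softened teacher and student distributions."""
    # Soften both distributions with the same temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale the KL term by T^2 so its gradient magnitude is comparable
    # to that of the hard-label loss.
    kd_loss = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    # Standard cross-entropy against the ground-truth labels.
    ce_loss = F.cross_entropy(student_logits, labels)
    return alpha * kd_loss + (1.0 - alpha) * ce_loss
```

In a typical training loop, the teacher logits would be computed once per batch under `torch.no_grad()` (the LSTM teacher is frozen), while the CNN student's logits and this combined loss drive the parameter updates.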