Deep neural networks have demonstrated superior performance in almost every Natural Language Processing task; however, their increasing complexity raises concerns. In particular, these networks demand expensive computational hardware, and the training budget is a concern for many. Even for a trained network, the inference phase can be too demanding for resource-constrained devices, limiting its applicability. The state-of-the-art transformer models are a vivid example. Simplifying the computations performed by a network is one way of relaxing the complexity requirements. In this paper, we propose an end-to-end binarized neural network architecture for the intent classification task. To fully utilize the potential of end-to-end binarization, both the input representations (vector embeddings of token statistics) and the classifier are binarized. We demonstrate the efficiency of such an architecture on intent classification of short texts over three datasets and on text classification with a larger dataset. The proposed architecture achieves results comparable to the state of the art on standard intent classification datasets while using approximately 20-40% less memory and training time. Furthermore, the individual components of the architecture, such as the binarized vector embeddings of documents or the binarized classifier, can be used separately within architectures that are not necessarily fully binary.
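The abstract does not specify the exact binarization scheme; sign-based binarization with bit-packing is a common choice for binarized representations. The following is a minimal sketch under that assumption (the helper names `binarize` and `pack_bits` and the 128-dimensional example are illustrative, not from the paper):

```python
import numpy as np

def binarize(x: np.ndarray) -> np.ndarray:
    """Map real-valued features to {-1, +1} via the sign function
    (one common binarization scheme; assumed here, not confirmed
    by the abstract)."""
    return np.where(x >= 0, 1.0, -1.0).astype(np.float32)

def pack_bits(b: np.ndarray) -> np.ndarray:
    """Pack {-1, +1} values into uint8 bit fields: 1 bit per value
    instead of 32, a 32x memory reduction versus float32."""
    bits = (b > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)

# Hypothetical example: a batch of 4 document embeddings, 128-dim each.
rng = np.random.default_rng(0)
emb = rng.standard_normal((4, 128)).astype(np.float32)
b = binarize(emb)
packed = pack_bits(b)
print(emb.nbytes, packed.nbytes)  # 2048 bytes -> 64 bytes
```

Storing embeddings this way also lets downstream classifiers replace floating-point dot products with XNOR-and-popcount operations, which is the usual source of the memory and compute savings in fully binarized pipelines.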