Character-based neural models have recently proven very useful for many NLP tasks. However, there is a gap of sophistication between methods for learning representations of sentences and words. While most character models for learning representations of sentences are deep and complex, models for learning representations of words are shallow and simple. Also, in spite of considerable research on learning character embeddings, it is still not clear which kind of architecture is the best for capturing character-to-word representations. To address these questions, we first investigate the gaps between methods for learning word and sentence representations. We conduct detailed experiments and comparisons of different state-of-the-art convolutional models, and also investigate the advantages and disadvantages of their constituents. Furthermore, we propose IntNet, a funnel-shaped wide convolutional neural architecture with no down-sampling for learning representations of the internal structure of words by composing their characters from limited, supervised training corpora. We evaluate our proposed model on six sequence labeling datasets, including named entity recognition, part-of-speech tagging, and syntactic chunking. Our in-depth analysis shows that IntNet significantly outperforms other character embedding models and obtains new state-of-the-art performance without relying on any external knowledge or resources.
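To make the architectural idea concrete, the following is a minimal, hypothetical sketch of a funnel-shaped wide character CNN in the spirit described above: stacked 1-D convolutions over a word's characters with no pooling between layers, a wider first layer followed by narrower ones, and all layers' feature maps concatenated before a single max-over-time pool produces the word representation. This is not the authors' IntNet implementation; the class name, layer widths, and kernel size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FunnelCharEncoder(nn.Module):
    """Hypothetical sketch of a funnel-shaped wide character CNN:
    stacked 1-D convolutions with no down-sampling between layers,
    whose feature maps are concatenated before a final max-over-time
    pool yields a fixed-size word vector."""

    def __init__(self, n_chars, char_dim=32, layer_dims=(128, 64, 64), kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(n_chars, char_dim, padding_idx=0)
        convs, in_dim = [], char_dim
        for out_dim in layer_dims:      # "funnel": wider first layer, narrower later ones
            convs.append(nn.Conv1d(in_dim, out_dim, kernel_size, padding=kernel_size // 2))
            in_dim = out_dim
        self.convs = nn.ModuleList(convs)
        self.out_dim = sum(layer_dims)  # concatenation of all layers' channels

    def forward(self, char_ids):                    # char_ids: (batch, max_word_len)
        x = self.embed(char_ids).transpose(1, 2)    # (batch, char_dim, len)
        feats = []
        for conv in self.convs:
            x = torch.relu(conv(x))                 # no pooling: character length preserved
            feats.append(x)
        h = torch.cat(feats, dim=1)                 # concatenate every layer's feature maps
        return h.max(dim=2).values                  # max over characters -> word representation

# Toy usage: encode a batch of two words, each padded to 6 characters.
enc = FunnelCharEncoder(n_chars=100)
words = torch.randint(1, 100, (2, 6))
print(enc(words).shape)                             # torch.Size([2, 256])
```

The word vectors produced this way would then feed a sequence labeler (e.g., a BiLSTM-CRF) for tasks such as named entity recognition, POS tagging, or chunking.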