With the rapid development of deep learning techniques, named entity recognition (NER) is becoming increasingly important in information extraction. The greatest difficulty in NER is maintaining detection performance when the entity types and documents are unfamiliar. Observing that specificity information may capture latent meanings of a word and yield semantically related features for word embeddings, we develop a distribution-aware word embedding and implement three different methods for using the distribution information in an NER framework. The results show that NER performance improves when word specificity is incorporated into existing NER methods.