Developing an automatic part-of-speech (POS) tagger for a new language is considered a necessary step before computational linguistics methods beyond tagging, such as chunking and parsing, can be fully applied to that language. Many POS disambiguation techniques have been developed for this kind of research, and several factors influence the choice among them. The chosen technique may be either corpus-based or non-corpus-based. In this paper, we present a review of POS tagging technologies.