The term natural language refers to any system of symbolic communication (spoken, signed or written) without intentional human planning and design. This distinguishes natural languages such as Arabic and Japanese from artificially constructed languages such as Esperanto or Python. Natural language processing (NLP) is the sub-field of artificial intelligence (AI) focused on modeling natural languages to build applications such as speech recognition and synthesis, machine translation, optical character recognition (OCR), sentiment analysis (SA), question answering, dialogue systems, etc. NLP is a highly interdisciplinary field with connections to computer science, linguistics, cognitive science, psychology, mathematics and others. Some of the earliest AI applications were in NLP (e.g., machine translation); and the last decade (2010-2020) in particular has witnessed an incredible increase in quality, matched with a rise in public awareness, use, and expectations of what may have seemed like science fiction in the past. NLP researchers pride themselves on developing language independent models and tools that can be applied to all human languages, e.g. machine translation systems can be built for a variety of languages using the same basic mechanisms and models. However, the reality is that some languages do get more attention (e.g., English and Chinese) than others (e.g., Hindi and Swahili). Arabic, the primary language of the Arab world and the religious language of millions of non-Arab Muslims is somewhere in the middle of this continuum. Though Arabic NLP has many challenges, it has seen many successes and developments. Next we discuss Arabic's main challenges as a necessary background, and we present a brief history of Arabic NLP. We then survey a number of its research areas, and close with a critical discussion of the future of Arabic NLP.
翻译:自然语言处理(NLP)是人工智能(AI)的子领域,其重点是模拟自然语言,以构建语言识别和合成、机器翻译、光学字符识别(OCR)、情绪分析(SA)、问答、对话系统等等应用。NLP是一个高度跨学科的领域,与计算机科学、语言、认知科学、心理学、数学和其他方面有联系。一些最早的AI应用在NLP(例如机器翻译)中,特别是过去十年(2010-2020年)中,自然语言处理(NLP)是人工智能(AI)的子领域,侧重于模拟自然语言,以构建语言识别和合成、机器翻译、光学品识别(OCR)、情感分析(SA)、感知解答、对话系统是一个高度跨学科的领域,例如,机器翻译系统可以建立各种语言,使用同样基本的阿拉伯语言(例如,现在的阿拉伯、现在的、现在的阿拉伯的、现在的、现在的、现在的阿拉伯的、现在的、现在的、现在的阿拉伯的、现在的、现在的、现在的、现在的、现在的、现在的阿拉伯的、现在的、现在的、现在的、现在的阿拉伯的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的阿拉伯的、现在的、现在的阿拉伯的阿拉伯的、现在的、现在的、现在的、许多的、现在的、现在的、现在的、现在的、现在的、现在的、现在的阿拉伯的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的阿拉伯的阿拉伯的、现在的、现在的、现在的、现在的、现在的、现在的阿拉伯的阿拉伯的、现在的阿拉伯的、现在的、现在的、现在的、现在的、现在的、现在的阿拉伯的、现在的、现在的、现在的阿拉伯的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的阿拉伯的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、现在的、以及将来的、现在的、许多的、许多的、