This study applies a character-level neural machine translation approach, trained on a bi-directional recurrent neural network architecture with long short-term memory units, to the diacritization of Medieval Arabic. The results improve on the online tool used as a baseline. The diacritization model has been published openly as an easy-to-use Python package, available on PyPI and Zenodo. We have found that context size should be considered when optimizing a feasible prediction model.
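To make the named architecture concrete, the following is a minimal sketch of a character-level bi-directional LSTM model of the kind described above. It is illustrative only: the class name, layer sizes, and the per-character prediction head (a sequence-labeling simplification rather than the full encoder-decoder translation setup) are assumptions, not the published model.

```python
# Illustrative sketch only; hyperparameters and the per-character
# classification head are assumptions, not the published model.
import torch
import torch.nn as nn

class CharBiLSTMDiacritizer(nn.Module):
    def __init__(self, n_chars, n_targets, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, n_targets)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) indices of undiacritized characters
        x = self.embed(char_ids)
        h, _ = self.bilstm(x)   # (batch, seq_len, 2 * hidden_dim)
        return self.out(h)      # per-character scores over diacritized targets

# Smoke test with random character indices
model = CharBiLSTMDiacritizer(n_chars=50, n_targets=60)
scores = model(torch.randint(0, 50, (2, 40)))
print(scores.shape)  # torch.Size([2, 40, 60])
```

The bidirectional layer gives each character access to both preceding and following context, which is why the abstract's note on context size matters for the quality of the predictions.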