In this paper, we investigate end-to-end and multi-module frameworks for grapheme-to-phoneme (G2P) conversion in Persian. Our results show that the proposed multi-module G2P system outperforms our end-to-end systems in both accuracy and speed. The system uses a pronunciation dictionary as a look-up table, together with separate models, built on GRU and Transformer architectures, that handle homographs, out-of-vocabulary words (OOVs), and ezafe. Because the system operates at the sequence level rather than the word level, it captures the unwritten relations between words (cross-word information) needed for homograph disambiguation and ezafe recognition, without any pre-processing. In evaluation, our system achieved a word-level accuracy of 94.48%, outperforming previous G2P systems for Persian.
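To make the multi-module idea concrete, the following is a minimal sketch of how a dictionary look-up could be combined with separate homograph, OOV, and ezafe components over a whole word sequence. It is not the system described here: the toy lexicon, the phoneme notation, the order in which modules are consulted, and the names `disambiguate_homograph`, `oov_model`, `needs_ezafe`, and `g2p` are all assumptions made for illustration, and the three "models" are hard-coded stubs standing in for trained GRU or Transformer networks.

```python
from typing import Dict, List, Set, Tuple

# Toy data, assumed for illustration only (not taken from the paper).
LEXICON: Dict[str, str] = {"کتاب": "ketAb", "من": "man"}
HOMOGRAPHS: Dict[str, List[str]] = {"مرد": ["mard", "mord"]}
EZAFE_PAIRS: Set[Tuple[str, str]] = {("کتاب", "من")}


def disambiguate_homograph(words: List[str], i: int, candidates: List[str]) -> str:
    """Stub for a homograph model; a real one would use the context in `words`."""
    return candidates[0]  # placeholder decision: always the first candidate


def oov_model(word: str) -> str:
    """Stub for a sequence-to-sequence model that transcribes unseen words."""
    return f"<oov:{word}>"  # placeholder output, not a real transcription


def needs_ezafe(words: List[str], i: int) -> bool:
    """Stub for an ezafe recognizer that looks at the current and next word."""
    return i + 1 < len(words) and (words[i], words[i + 1]) in EZAFE_PAIRS


def g2p(sentence: str) -> List[str]:
    """Convert a whole word sequence so every module can see cross-word context."""
    words = sentence.split()
    phonemes: List[str] = []
    for i, word in enumerate(words):
        if word in HOMOGRAPHS:
            ph = disambiguate_homograph(words, i, HOMOGRAPHS[word])
        elif word in LEXICON:
            ph = LEXICON[word]  # plain dictionary look-up
        else:
            ph = oov_model(word)  # fall back to the OOV component
        if needs_ezafe(words, i):
            ph += "-e"  # toy marker for the unwritten ezafe vowel
        phonemes.append(ph)
    return phonemes
```

Because `g2p` passes the full word list and the current index to each component, the homograph and ezafe stubs have access to neighbouring words, which is the property the sequence-level design relies on. In this sketch, `g2p("کتاب من")` attaches the ezafe marker to the first word because the pair is listed in `EZAFE_PAIRS`.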