Recent years have witnessed wider adoption of Automated Speech Recognition (ASR) techniques in various domains. Consequently, evaluating and enhancing the quality of ASR systems is of great importance. This paper proposes ASDF, an Automated Speech Recognition Differential Testing Framework for testing ASR systems. ASDF extends an existing ASR testing tool, CrossASR++, which synthesizes test cases from a text corpus. However, CrossASR++ does not use the text corpus efficiently and provides limited information on how failed test cases can be used to improve ASR systems. To address these limitations, our tool incorporates two novel features: (1) a text transformation module that boosts the number of generated test cases and uncovers more errors in ASR systems, and (2) a phonetic analysis module that identifies the phonemes on which ASR systems tend to produce errors. ASDF generates more high-quality test cases by applying various text transformation methods (e.g., changing tense) to the texts in failed test cases. By doing so, ASDF can use a small text corpus to generate a large number of audio test cases, which CrossASR++ cannot do. In addition, ASDF implements more metrics to evaluate the performance of ASR systems from multiple perspectives. ASDF performs phonetic analysis on the failed test cases to identify the phonemes that ASR systems tend to transcribe incorrectly, providing useful information for developers to improve ASR systems. A demonstration video of our tool is available online at https://www.youtube.com/watch?v=DzVwfc3h9As. The implementation is available at https://github.com/danielyuenhx/asdf-differential-testing.
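To make the two ideas of differential testing and phonetic analysis concrete, the following is a minimal sketch and not the ASDF implementation. It rests on assumptions that the abstract does not state: a test case is labelled failed for an ASR system when that system mis-transcribes the text while at least one other system transcribes it correctly, and phonemes are looked up in a small hypothetical lexicon (`TOY_LEXICON`). The function names `classify` and `missed_phonemes` are likewise illustrative only.

```python
from collections import Counter

# Hypothetical toy pronunciation lexicon (assumption for illustration);
# a real tool would rely on a full grapheme-to-phoneme resource.
TOY_LEXICON = {
    "the": ["DH", "AH"],
    "cat": ["K", "AE", "T"],
    "sat": ["S", "AE", "T"],
    "sad": ["S", "AE", "D"],
    "down": ["D", "AW", "N"],
}


def normalize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return [w for w in words if w]


def classify(reference, transcriptions):
    """Label each ASR system's output for one test case.

    Assumed rule: 'failed' means this system is wrong while at least one
    other system is right; if no system is right, the case is labelled
    'indeterminable' because the synthesized audio itself may be at fault.
    """
    ref = normalize(reference)
    correct = {name: normalize(hyp) == ref for name, hyp in transcriptions.items()}
    if not any(correct.values()):
        return {name: "indeterminable" for name in transcriptions}
    return {name: "success" if ok else "failed" for name, ok in correct.items()}


def missed_phonemes(reference, hypothesis, lexicon=TOY_LEXICON):
    """Count phonemes of reference words that do not appear in the hypothesis."""
    remaining = Counter(normalize(hypothesis))
    counts = Counter()
    for word in normalize(reference):
        if remaining[word] > 0:
            remaining[word] -= 1  # word was transcribed; consume one match
        else:
            counts.update(lexicon.get(word, []))  # unknown words add nothing
    return counts


reference = "The cat sat down."
transcriptions = {"asr_a": "the cat sat down", "asr_b": "the cat sad down"}
labels = classify(reference, transcriptions)
phoneme_errors = Counter()
for name, label in labels.items():
    if label == "failed":
        phoneme_errors += missed_phonemes(reference, transcriptions[name])
```

In this toy run, `asr_b` is the only system labelled failed, and the phonemes of the missed word "sat" are the ones counted. Aggregating such counts over many failed test cases is one plausible way to surface the phonemes a system tends to get wrong; the bag-of-words matching used here ignores word order and is a deliberate simplification.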