Speech recognition systems for Spanish, such as Google's, frequently produce errors when used in domain-specific applications. Most of these errors occur on words that are new to the recognizer's language model or specific to the domain. This article presents an algorithm that applies Levenshtein distance over phonemes to reduce the speech recognizer's errors. Preliminary results show that it is possible to correct the recognizer's errors significantly by combining this metric with a dictionary of domain-specific phrases. Although designed for particular domains, the algorithm is generally applicable: the phrases to be recognized are defined explicitly for each application, and the algorithm itself needs no modification. It only needs to be given the set of phrases on which to work. The algorithm's complexity is $O(tn)$, where $t$ is the number of words in the transcript to be corrected and $n$ is the number of domain-specific phrases.
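The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`to_phonemes`, `levenshtein`, `correct`), the toy Spanish letter-to-phoneme rules, the sliding-window matching strategy, and the `threshold` value of 0.25 are all assumptions made for illustration. A real system would use a proper phonetic transcriber. The sketch compares each window of transcript words against each domain phrase, so it performs on the order of $t \times n$ distance computations, which is in line with the stated $O(tn)$ bound if phrase length is treated as bounded.

```python
import unicodedata

# Toy, hypothetical letter-to-phoneme rules for Spanish (applied in order).
# Uppercase "C" stands in for the "ch" sound so the later "c" rule skips it.
RULES = [("ch", "C"), ("qu", "k"), ("ll", "y"), ("ce", "se"), ("ci", "si"),
         ("c", "k"), ("z", "s"), ("v", "b"), ("h", "")]


def to_phonemes(text):
    """Very rough phonemic form of a Spanish phrase (assumption, not a real G2P)."""
    text = unicodedata.normalize("NFC", text.lower()).replace("ñ", "ɲ")
    # Strip accents: decompose, then drop combining marks.
    text = unicodedata.normalize("NFD", text)
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    for old, new in RULES:
        text = text.replace(old, new)
    return text.replace(" ", "")


def levenshtein(a, b):
    """Edit distance between two sequences with unit-cost operations."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (x != y)))    # substitution
        prev = curr
    return prev[-1]


def correct(transcript, phrases, threshold=0.25):
    """Replace word windows that sound like a domain phrase with that phrase."""
    words = transcript.split()
    targets = [(p, len(p.split()), to_phonemes(p)) for p in phrases]
    out, i = [], 0
    while i < len(words):
        best = None  # (normalized distance, phrase, window width)
        for phrase, width, phon in targets:
            window = words[i:i + width]
            if len(window) < width:
                continue  # not enough words left for this phrase
            cand = to_phonemes(" ".join(window))
            score = levenshtein(cand, phon) / max(len(cand), len(phon), 1)
            if score <= threshold and (best is None or score < best[0]):
                best = (score, phrase, width)
        if best is not None:
            out.append(best[1])
            i += best[2]
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)


# Hypothetical domain dictionary and a transcript with a misrecognized word.
phrases = ["pizza margarita", "agua mineral"]
print(correct("quiero una pisa margarita", phrases))
```

Adapting the sketch to another application only requires changing the `phrases` list, which mirrors the abstract's claim that the set of phrases is supplied per application while the algorithm stays the same.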