Speech recognition lets people interact with and command machines, which makes it a central topic in human-computer interaction. A speech recognition system is language-dependent: it is built directly on the linguistic and textual properties of the target language. Automatic Speech Recognition (ASR) systems are currently used to transcribe speech to text with high accuracy. Although ASR systems are well established for widely spoken international languages, ASR for the Bengali language has not yet reached an acceptable state. In this work, we carefully survey the current status of research on Bengali ASR systems. We then present the challenges most commonly encountered when constructing a Bengali ASR system. We divide these into language-dependent and language-independent challenges and offer guidance on how each may be overcome. Following a rigorous investigation of these challenges, we conclude that Bengali ASR requires architectures designed specifically around the grammatical and phonetic structure of the Bengali language.