A Convolutional Neural Network Based Approach to Recognize Bangla Spoken Digits from Speech Signal

From arXiv. 4 pages, 5 figures. 2021 International Conference on Electronics, Communications and Information Technology (ICECIT), 14 to 16 September 2021, Khulna, Bangladesh.

Speech recognition converts human speech signals into text, words, or any other form that computers and other machines can readily process. The few existing studies on Bangla digit recognition mostly used small datasets with little variation in gender, age, dialect, and other variables. In this study, audio recordings of Bangladeshi speakers of various genders, ages, and dialects were used to create a large speech dataset of the spoken Bangla digits '0-9'. For each digit, 400 noisy and noise-free samples were recorded. Mel Frequency Cepstral Coefficients (MFCCs) were used to extract meaningful features from the raw speech data, and Convolutional Neural Networks (CNNs) were then used to recognize the Bangla digits. The proposed technique recognizes the spoken Bangla digits '0-9' with 97.1% accuracy over the whole dataset. The model was also evaluated with 10-fold cross-validation, which yielded 96.7% accuracy.
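To make the 10-fold cross-validation step concrete, the following is a minimal sketch of how such a dataset could be partitioned. It is not the authors' code. It assumes 400 recordings per digit in total (the abstract could also be read as 400 noisy plus 400 noise-free), and it assumes stratified folds, which the abstract does not specify. The function name `stratified_k_fold` and the label list are hypothetical.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over k folds.

    Stratification is an assumption made for this sketch; the abstract only
    says that 10-fold cross-validation was used.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle each class and deal its samples round-robin into the folds.
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical labels: 10 digits x 400 samples each (assumed dataset size).
labels = [digit for digit in range(10) for _ in range(400)]
splits = list(stratified_k_fold(labels, k=10))
train_idx, test_idx = splits[0]
print(len(splits), len(train_idx), len(test_idx))  # 10 3600 400
```

Under these assumptions, each fold trains on 3600 samples and tests on 400, with 40 test samples per digit. A model would be trained once per fold, and the per-fold accuracies would then be averaged to give a single cross-validated figure.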


