The incidence of voice diseases is increasing year by year, and software-based remote diagnosis is a growing technical trend with significant practical value. Common voice diseases that cause hoarseness include spasmodic dysphonia, vocal cord paralysis, vocal nodules, and vocal cord polyps. This paper presents a voice disease detection method intended for use across a wide range of clinical settings. In cooperation with Xiangya Hospital of Central South University, we collected voice samples from 61 patients. Mel-frequency cepstral coefficients (MFCCs) are extracted as input features to give a numerical description of the voice. We propose a model that feeds these MFCC features into a CNN with a single convolutional layer, which keeps computation and classification fast. The highest accuracy we achieved on the clinical data was 92%, which exceeds previously reported results and is competitive with the international state of the art. To evaluate the generalization ability of the proposed method, we also tested it on the Advanced Voice Function Assessment Database (AVFAD), where it achieved an accuracy of 98%. Experiments on both the clinical and the standard dataset show that, for the pathological detection of voice diseases, our method substantially improves both accuracy and computational efficiency.
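The pipeline the abstract describes (MFCC features passed through one convolutional layer) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract gives no frame length, hop size, filter count, coefficient count, or kernel shape, so every such value below is an assumption, and the function names (`mfcc`, `mel_filterbank`, `conv2d_relu`) are hypothetical. The kernel is a fixed averaging filter standing in for learned weights, and the classifier head is omitted.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies (Hz), evenly spaced in mel.
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * sample_rate / n_fft  # center frequency of FFT bin k
            row.append(max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid))))
        bank.append(row)
    return bank


def power_spectrum(frame, n_fft):
    """Naive O(n^2) DFT power spectrum; adequate for a sketch, slow in practice."""
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n_fft)
                for t, x in enumerate(frame))) ** 2 / n_fft
        for k in range(n_fft // 2 + 1)
    ]


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame (assumed sizes)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * t / (frame_len - 1))
              for t in range(frame_len)]  # Hamming window
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_len)]
        spec = power_spectrum(frame, frame_len)
        # Log mel-band energies; the small constant avoids log(0).
        log_mel = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                   for row in bank]
        # DCT-II of the log energies gives the cepstral coefficients.
        features.append([
            sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_mel))
            for c in range(n_coeffs)
        ])
    return features


def conv2d_relu(x, kernel, bias=0.0):
    """A single 'valid' 2-D convolution (cross-correlation) followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [max(0.0, bias + sum(x[i + a][j + b] * kernel[a][b]
                             for a in range(kh) for b in range(kw)))
         for j in range(len(x[0]) - kw + 1)]
        for i in range(len(x) - kh + 1)
    ]


# Synthetic 0.1 s, 220 Hz tone at 8 kHz as a stand-in for a voice recording.
tone = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(800)]
feats = mfcc(tone, 8000)                       # frames x coefficients
fmap = conv2d_relu(feats, [[1 / 9] * 3] * 3)   # untrained 3x3 averaging kernel
```

In practice one would likely use an established audio library for the MFCC step and a deep learning framework for the convolutional layer and training; the sketch only makes the data flow concrete, from waveform to a frames-by-coefficients matrix to a feature map.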