Objective: Voice disorders significantly compromise a person's ability to speak in daily life. Without early diagnosis and treatment, these disorders may deteriorate drastically. Automatic classification systems that can be used at home are therefore desirable for people who lack access to clinical disease assessment. However, the performance of such systems may be degraded by constrained resources and by the domain mismatch between clinical data and noisy real-world data. Methods: This study develops a compact and domain-robust voice disorder classification system that classifies utterances as healthy, neoplasm, or benign structural disease. The proposed system uses a feature extractor composed of factorized convolutional neural networks and then applies domain adversarial training, which reduces the domain mismatch by extracting domain-invariant features. Results: The unweighted average recall improved by 13% in the noisy real-world domain and remained at 80% in the clinic domain, with only slight degradation. The domain mismatch was effectively eliminated. Moreover, the proposed system reduced both memory usage and computation by over 73.9%. Conclusion: By deploying factorized convolutional neural networks and domain adversarial training, domain-invariant features can be derived for voice disorder classification with limited resources. The promising results confirm that, by accounting for the domain mismatch, the proposed system can significantly reduce resource consumption while improving classification accuracy. Significance: To the best of our knowledge, this is the first study that jointly considers the real-world issues of model compression and noise robustness in voice disorder classification. The proposed system is intended for deployment on embedded systems with limited resources.
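The results above are reported as unweighted average recall (UAR), that is, the mean of the per-class recalls, so each class counts equally regardless of how many utterances it contains. The following is a minimal sketch of that metric in plain Python. The function name `unweighted_average_recall`, the label strings, and the toy predictions are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly classified samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall of class c is correct[c] / total[c]; UAR averages over classes.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical, imbalanced toy data with the three classes named in the abstract.
y_true = ["healthy"] * 6 + ["neoplasm"] * 2 + ["structural"] * 2
y_pred = (["healthy"] * 6
          + ["neoplasm", "healthy"]
          + ["structural", "healthy"])

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the majority class is classified perfectly while each minority class is correct only half the time, so plain accuracy is higher than UAR. This is one reason UAR is a common choice when class sizes are imbalanced.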