This study introduces a novel methodology for voice pathology detection using the publicly available Saarbrücken Voice Database (SVD) and a robust feature set that combines commonly used handcrafted acoustic features with two novel ones: pitch difference (relative variation in fundamental frequency) and a NaN feature (failed fundamental frequency estimation). We evaluate six machine learning (ML) classifiers (support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost), using grid search over feasible hyperparameters of the selected classifiers and 20,480 different feature subsets. The top 1000 classifier-feature subset combinations for each classifier type are validated with repeated stratified cross-validation. To address class imbalance, we apply K-Means SMOTE to augment the training data. Our approach reaches 85.61%, 84.69%, and 85.22% unweighted average recall (UAR) for females, males, and combined results, respectively. We intentionally omit accuracy, as it is a highly biased metric for imbalanced data. These results demonstrate significant potential for clinical deployment of ML methods as a valuable supportive tool for objective examination of voice pathologies. To make our methodology easier to use and to support our claims, we provide a publicly available GitHub repository with DOI 10.5281/zenodo.13771573. Finally, we provide a REFORMS checklist to enhance the readability, reproducibility, and justification of our approach.
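
To illustrate why UAR is reported instead of accuracy, the following minimal Python sketch computes both metrics on a toy imbalanced label set. UAR is taken here in its usual sense, the unweighted mean of per-class recalls. The function names, the 90/10 class split, and the trivial always-pathological classifier are hypothetical and chosen only for illustration; they are not taken from the study or its repository.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class counts equally, whatever its size."""
    hits = defaultdict(int)    # correctly predicted samples per true class
    totals = defaultdict(int)  # number of samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly, dominated by the majority class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 90 pathological (1) and 10 healthy (0) recordings.
y_true = [1] * 90 + [0] * 10
# A trivial classifier that labels every recording as pathological.
y_pred = [1] * 100

toy_accuracy = accuracy(y_true, y_pred)
toy_uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy = {toy_accuracy:.2f}, UAR = {toy_uar:.2f}")
```

In this toy case the trivial classifier looks strong under accuracy but sits at chance level under UAR, because the recall of the minority class is zero.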