Data collection has improved across many fields, and medicine is no exception. Auscultation is an important diagnostic technique for physicians, and the progress and availability of digital stethoscopes make it well suited to machine learning. Because auscultations are performed in large numbers, the resulting data offer an opportunity for more effective analysis of sounds for which prognostic accuracy remains low even among experts. In this study, digital 6-channel auscultations of 45 patients were used in several machine learning scenarios, with the aim of distinguishing normal from anomalous pulmonary sounds. Audio features (such as fundamental frequencies F0-4, loudness, harmonics-to-noise ratio (HNR), detrended fluctuation analysis (DFA), and descriptive statistics of log energy, RMS and MFCC) were extracted using the Python library Surfboard. Windowing, feature aggregation and feature concatenation strategies were used to prepare the data for tree-based ensemble models in unsupervised (fair-cut forest) and supervised (random forest) settings. Evaluation used 9-fold stratified cross-validation repeated 30 times. Decision fusion by averaging the outputs for each subject was tested and found to be useful. Supervised models showed a consistent advantage over unsupervised ones, achieving a mean AUC ROC of 0.691 (accuracy 71.11%, Kappa 0.416, F1-score 0.771) in side-based detection and a mean AUC ROC of 0.721 (accuracy 68.89%, Kappa 0.371, F1-score 0.650) in patient-based detection.
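The decision fusion step mentioned above (averaging model outputs per subject) can be illustrated with a minimal sketch. This is not the study's implementation: the function names `fuse_by_subject` and `auc_roc`, the subject identifiers, and all scores and labels are hypothetical toy values chosen for illustration, and the AUC is computed with a simple pairwise definition rather than any particular library.

```python
from collections import defaultdict
from statistics import mean


def fuse_by_subject(subject_ids, scores):
    """Average per-window (or per-channel) scores into one score per subject."""
    grouped = defaultdict(list)
    for sid, score in zip(subject_ids, scores):
        grouped[sid].append(score)
    return {sid: mean(vals) for sid, vals in grouped.items()}


def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: anomaly scores for several windows per subject.
subject_ids = ["p1", "p1", "p1", "p2", "p2", "p3", "p3", "p3"]
scores = [0.9, 0.4, 0.8, 0.2, 0.3, 0.6, 0.1, 0.2]
subject_labels = {"p1": 1, "p2": 0, "p3": 0}  # 1 = anomalous (assumed)

# Fuse window-level outputs into one decision score per subject.
fused = fuse_by_subject(subject_ids, scores)

# Evaluate at the subject level using the fused scores.
subjects = sorted(fused)
auc = auc_roc(
    [subject_labels[s] for s in subjects],
    [fused[s] for s in subjects],
)
```

Averaging is only one possible fusion rule; it is used here because it is the one the abstract names. In a full pipeline the fused scores would be computed separately within each cross-validation fold.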