A paper by Alsinglawi et al. was recently accepted and published in Scientific Reports. In it, the authors aim to predict the length of stay (LOS) of lung cancer patients in an ICU department, discretized into long (> 7 days) or short (< 7 days) stays, using various machine learning techniques. The authors claim to achieve perfect results, with an Area Under the Receiver Operating Characteristic curve (AUROC) of 100% using a Random Forest (RF) classifier combined with the ADASYN oversampling technique for class balancing. If accurate, this could have significant implications for hospital management. However, we have identified several methodological flaws in the manuscript that make the results overly optimistic and that would have serious consequences if the model were used in clinical practice. Moreover, the reporting of the methodology is unclear and many important details are missing from the manuscript, which makes reproduction extremely difficult. We highlight the effect these oversights have had on the results and report a more believable AUROC of 88.91% once they are corrected.