With the ongoing COVID-19 pandemic, understanding the characteristics of the virus has become an important and challenging task for the scientific community. Although tests for COVID-19 exist, the goal of our research is to explore other methods of identifying infected individuals. We apply unsupervised clustering techniques to a dataset of lung scans from individuals with COVID-19, individuals with viral pneumonia, and healthy individuals. Because COVID-19 is a novel disease that is still being studied in detail, we examine whether unsupervised clustering algorithms can reveal hidden differences between COVID-19 and other respiratory illnesses. Our experiments use Principal Component Analysis (PCA), K-Means++ (KM++), and the recently developed Robust Continuous Clustering (RCC) algorithm. We evaluate the performance of KM++ and RCC in clustering COVID-19 lung scans using the Adjusted Mutual Information (AMI) score.
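The PCA, KM++, and AMI steps named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses scikit-learn, the data are synthetic vectors standing in for flattened lung scans, and the helper names (`cluster_and_score`, `make_synthetic_scans`), the number of principal components, and the random seed are all hypothetical choices. RCC is omitted because scikit-learn does not provide it.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_mutual_info_score


def cluster_and_score(features, labels, n_components=10, n_clusters=3, seed=0):
    """Reduce with PCA, cluster with k-means++ seeding, return (assignments, AMI)."""
    # Project the high-dimensional inputs onto a few principal components.
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(features)
    # K-means with k-means++ initialization; labels are NOT used for fitting.
    km = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10, random_state=seed)
    assignments = km.fit_predict(reduced)
    # AMI compares cluster assignments with the known classes, corrected for chance.
    return assignments, adjusted_mutual_info_score(labels, assignments)


def make_synthetic_scans(n_per_class=60, n_pixels=256, seed=0):
    """Hypothetical stand-in data: three well-separated Gaussian classes."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 5.0, size=(3, n_pixels))
    features = np.vstack(
        [c + rng.normal(0.0, 1.0, size=(n_per_class, n_pixels)) for c in centers]
    )
    labels = np.repeat(np.arange(3), n_per_class)
    return features, labels


features, labels = make_synthetic_scans()
assignments, ami = cluster_and_score(features, labels)
print(f"AMI on synthetic data: {ami:.3f}")
```

The ground-truth labels enter only at the scoring step, which is what keeps the clustering itself unsupervised. AMI is also invariant to how the cluster indices are permuted, so no matching between cluster IDs and class names is needed.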