Coronavirus disease (COVID-19) has spread rapidly throughout the world, and although pregnant women present the same adverse outcome rates, they are underrepresented in clinical research. We collected clinical data from 155 pregnant women who tested positive for COVID-19 at Stony Brook University Hospital. Many of the collected data are multivariate categorical, where the number of possible outcomes grows exponentially with the dimension of the data. We modeled the data within an unsupervised Bayesian framework and mapped them into a lower-dimensional space using latent Gaussian processes. The latent features in the lower-dimensional space were then used to predict whether a pregnant woman would be admitted to a hospital due to COVID-19 or would remain with mild symptoms. We compared the prediction accuracy with that obtained from dummy/one-hot encoding of the categorical data and found that the latent Gaussian process achieved higher accuracy.
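To make the baseline concrete, the following is a minimal sketch of dummy/one-hot encoding for multivariate categorical data, together with a count of the joint outcomes that such data can take. The variable names and levels in `LEVELS` are hypothetical and chosen only for illustration; they are not taken from the study's data, and the helper functions are assumptions rather than the authors' implementation.

```python
from math import prod

# Hypothetical categorical variables and their levels (illustrative only,
# not taken from the study's data).
LEVELS = {
    "symptom_cough": ["no", "yes"],
    "symptom_fever": ["no", "yes"],
    "trimester": ["first", "second", "third"],
}


def joint_outcome_count(levels):
    """Number of distinct joint configurations across all variables."""
    return prod(len(values) for values in levels.values())


def one_hot_encode(record, levels):
    """Concatenate one indicator vector per variable, in dictionary order."""
    encoded = []
    for name, values in levels.items():
        encoded.extend(1 if record[name] == value else 0 for value in values)
    return encoded


# A single hypothetical record.
record = {"symptom_cough": "yes", "symptom_fever": "no", "trimester": "third"}
vector = one_hot_encode(record, LEVELS)
n_joint = joint_outcome_count(LEVELS)

print(vector)   # one indicator per level of each variable
print(n_joint)  # product of the number of levels per variable
```

The one-hot vector grows only with the sum of the level counts, while the number of joint configurations grows with their product. With few observations relative to that product, a lower-dimensional representation of the kind described above can be an appealing alternative to the raw indicators.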