The novel coronavirus SARS-CoV-2, which causes COVID-19, has given rise to an ongoing pandemic. Research on it is progressing rapidly, with up to hundreds of publications uploaded to databases daily. We explore the use of artificial intelligence and natural language processing to sort through these publications efficiently. We demonstrate that clinical trial information, preclinical studies, and a general topic model can serve as text-mining data intelligence tools that scientists worldwide can use as a resource for their own research. To evaluate our method, we use several metrics to measure the information extraction and clustering results. In addition, we demonstrate that our workflow has a use case not only for COVID-19 but for other disease areas as well. Overall, our system aims to help scientists research coronavirus more efficiently. Our automatically updating modules are publicly available on our information portal at https://ghddi-ailab.github.io/Targeting2019-nCoV/.