The deployment of digital systems across many industries has sharply increased the volume of recorded digital information. The health sector in particular has widely adopted digital devices and systems that generate large volumes of personal medical records. Electronic health records contain valuable information for retrospective and prospective analysis, but this information is often not fully exploited because the records are so dense. The basic purpose of condensing health records is to select the information that retains most of the characteristics of the original documents with respect to the reported disease. Such summaries may support diagnosis and extend the time a doctor can spend with the patient during high-workload situations such as the COVID-19 pandemic. In this paper, we propose applying a multi-head attention-based mechanism to perform extractive summarization of meaningful phrases in clinical notes. The method identifies the key sentences for a summary by correlating token, segment, and positional embeddings. The model outputs attention scores that are statistically transformed to extract key phrases and that can be projected onto a heat-mapping tool for visual inspection by human readers.
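As a rough illustration of how attention scores might be turned into key phrases, the sketch below computes scaled dot-product attention weights per head, averages the attention each token receives, z-scores those averages, and merges contiguous high-scoring tokens into phrases. The abstract does not specify the statistical transformation, so the z-score step, the threshold value, and all function names here (`attention_weights`, `token_scores`, `extract_phrases`) are assumptions made for illustration, not the paper's implementation.

```python
import math
import statistics


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Scaled dot-product attention weights for one head (each row sums to 1)."""
    d = len(keys[0])
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys])
        for q in queries
    ]


def token_scores(heads):
    """Average attention each token receives over all heads and query positions."""
    n = len(heads[0])
    return [
        statistics.fmean(head[q][t] for head in heads for q in range(n))
        for t in range(n)
    ]


def extract_phrases(tokens, scores, z_threshold=0.5):
    """Z-score the token scores (an assumed transformation) and merge
    contiguous tokens whose z-score exceeds the threshold into phrases."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores) or 1.0  # avoid division by zero
    phrases, current = [], []
    for token, score in zip(tokens, scores):
        if (score - mean) / std > z_threshold:
            current.append(token)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Hypothetical per-token scores for a short clinical note fragment.
tokens = ["patient", "reports", "severe", "chest", "pain", "today"]
scores = [0.10, 0.05, 0.25, 0.30, 0.25, 0.05]
print(extract_phrases(tokens, scores))  # ['severe chest pain']
```

The same z-scores could also serve as the per-token intensities for a heat-map overlay, which is one plausible reading of the visualization step mentioned in the abstract.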