Radiology report summarization is a growing area of research. Given the Findings and/or Background sections of a radiology report, the goal is to generate a summary (called an Impression section) that highlights the key observations and conclusions of the radiology study. Recent efforts have released systems that achieve promising performance as measured by widely used summarization metrics such as BLEU and ROUGE. However, research on radiology report summarization currently faces important limitations. First, most results are reported on private datasets, which makes it impossible to reproduce them or to compare different systems and solutions fairly. Second, to the best of our knowledge, most research is carried out on chest X-rays. Some studies do not even state the modality and anatomy of the radiology reports used in their experiments. To address these limitations, we propose a new dataset of six different modalities and anatomies based on the MIMIC-III database. We further release our results and the data splits used to carry out our experiments. Finally, we propose a simple report summarization system that outperforms previous replicable research on the existing dataset.