Information visualizations such as bar charts and line charts are widely used for exploring data and communicating insights. Interpreting such visualizations can be challenging for some people, such as those who are visually impaired or have low visualization literacy. In this work, we introduce a new dataset and present a neural model for automatically generating natural language summaries of charts. The generated summaries provide an interpretation of the chart and convey the key insights found within it. We develop our model by extending the state-of-the-art model for the data-to-text generation task, which uses a transformer-based encoder-decoder architecture. Our approach outperforms the base model on a content selection metric by a wide margin (55.42% vs. 8.49%) and generates more informative, concise, and coherent summaries.
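The abstract reports a content selection score but does not define it. As a rough illustration only, the sketch below assumes a simplified definition: the fraction of chart records (labels or values) mentioned in the gold summary that also appear in the generated summary. The function names `extract_records` and `content_selection`, the token-matching rule, and the example chart are all hypothetical and are not taken from the paper.

```python
from __future__ import annotations

import re


def extract_records(summary: str, chart_values: set[str]) -> set[str]:
    """Return the chart values (labels or numbers) mentioned in the summary.

    Assumption: a record counts as "mentioned" if it appears as a whole
    token in the lowercased summary text.
    """
    tokens = set(re.findall(r"[A-Za-z]+|\d+(?:\.\d+)?", summary.lower()))
    return {value for value in chart_values if value.lower() in tokens}


def content_selection(generated: str, gold: str, chart_values: set[str]) -> float:
    """Fraction of gold-summary records that the generated summary also mentions."""
    gold_records = extract_records(gold, chart_values)
    if not gold_records:
        return 0.0  # nothing to select, so the score is defined as zero here
    generated_records = extract_records(generated, chart_values)
    return len(generated_records & gold_records) / len(gold_records)


# Hypothetical chart data and summaries, for illustration only.
chart_values = {"germany", "2015", "2019", "40", "75"}
gold = "Sales in Germany rose from 40 in 2015 to 75 in 2019."
generated = "Germany had sales of 75 in 2019."

score = content_selection(generated, gold, chart_values)
print(f"content selection: {score:.2f}")
```

Under this assumed definition, a summary that mentions more of the records chosen by the human-written summary scores higher, regardless of fluency. The actual metric behind the reported 55.42% vs. 8.49% comparison may differ in how records are extracted and matched.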