Natural language interfaces (NLIs) have shown great promise for visual data analysis, allowing people to flexibly specify and interact with visualizations. However, developing visualization NLIs remains a challenging task, requiring low-level implementation of natural language processing (NLP) techniques as well as knowledge of visual analytic tasks and visualization design. We present NL4DV, a toolkit for natural language-driven data visualization. NL4DV is a Python package that takes as input a tabular dataset and a natural language query about that dataset. In response, the toolkit returns an analytic specification modeled as a JSON object containing data attributes, analytic tasks, and a list of Vega-Lite specifications relevant to the input query. In doing so, NL4DV aids visualization developers who may not have a background in NLP, enabling them to create new visualization NLIs or incorporate natural language input within their existing systems. We demonstrate NL4DV's usage and capabilities through four examples: 1) rendering visualizations using natural language in a Jupyter notebook, 2) developing an NLI to specify and edit Vega-Lite charts, 3) recreating data ambiguity widgets from the DataTone system, and 4) incorporating speech input to create a multimodal visualization system.
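To make the input/output contract described above concrete, the following is a minimal, self-contained sketch of a function that accepts a tabular dataset and a natural language query and returns a JSON-serializable object holding data attributes, analytic tasks, and a list of Vega-Lite specifications. It is not NL4DV's implementation or API: the function name `analyze_query`, the output keys (`attributes`, `tasks`, `vega_lite_specs`), the keyword table, and the sample data are all assumptions made for illustration, and simple keyword matching stands in for the NLP techniques a real toolkit would use.

```python
import json
import re

# Hypothetical keyword table mapping analytic tasks to trigger words.
TASK_KEYWORDS = {
    "correlation": {"relationship", "correlate", "correlation", "versus", "vs"},
    "distribution": {"distribution", "histogram", "spread"},
}

VEGA_LITE_SCHEMA = "https://vega.github.io/schema/vega-lite/v5.json"


def is_quantitative(rows, column):
    """Treat a column as quantitative if every value is a number."""
    return all(isinstance(row[column], (int, float)) for row in rows)


def analyze_query(rows, query):
    """Toy stand-in for a query analyzer (illustrative only, not NL4DV).

    rows  -- tabular dataset as a list of dicts with identical keys
    query -- natural language question about that dataset
    """
    columns = list(rows[0].keys())
    tokens = set(re.findall(r"[a-z0-9]+", query.lower()))

    # Attributes: columns whose name appears verbatim as a query token.
    attributes = [c for c in columns if c.lower() in tokens]

    # Tasks: any task whose trigger words overlap the query tokens.
    tasks = [task for task, words in TASK_KEYWORDS.items() if tokens & words]

    # Visualizations: one simple rule per attribute combination.
    quantitative = [a for a in attributes if is_quantitative(rows, a)]
    specs = []
    if len(attributes) == 2 and len(quantitative) == 2:
        # Two numeric attributes: scatterplot.
        specs.append({
            "$schema": VEGA_LITE_SCHEMA,
            "data": {"values": rows},
            "mark": "point",
            "encoding": {
                "x": {"field": attributes[0], "type": "quantitative"},
                "y": {"field": attributes[1], "type": "quantitative"},
            },
        })
    elif len(attributes) == 1 and len(quantitative) == 1:
        # One numeric attribute: binned histogram.
        specs.append({
            "$schema": VEGA_LITE_SCHEMA,
            "data": {"values": rows},
            "mark": "bar",
            "encoding": {
                "x": {"field": attributes[0], "bin": True, "type": "quantitative"},
                "y": {"aggregate": "count", "type": "quantitative"},
            },
        })

    return {"attributes": attributes, "tasks": tasks, "vega_lite_specs": specs}


# Made-up sample data, for illustration only.
movies = [
    {"title": "A", "budget": 10, "gross": 25},
    {"title": "B", "budget": 40, "gross": 90},
    {"title": "C", "budget": 15, "gross": 12},
]
result = analyze_query(movies, "Show the relationship between budget and gross")
print(json.dumps(result, indent=2))
```

Because the returned object is plain JSON, a host application could pass any entry of `vega_lite_specs` to a Vega-Lite renderer, or inspect `attributes` and `tasks` to build its own interface elements, without needing to know how the query was parsed.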