Data workers use various scripting languages for data transformation, such as SAS, R, and Python. However, understanding intricate code requires advanced programming skills, which hinders data workers from grasping the idea of a data transformation with ease. Program visualization is beneficial for debugging and education and has the potential to illustrate transformations intuitively and interactively. In this paper, we explore visualization design for demonstrating the semantics of code in the context of data transformation. First, to depict individual data transformations, we structure a design space along two primary dimensions: the key parameters to encode and the visual channels they can be mapped to. Then, we derive a collection of 23 glyphs that visualize the semantics of transformations. Next, we design a pipeline, named Somnus, that uses a provenance graph to provide an overview of the creation and evolution of data tables while also allowing detailed investigation of individual transformations. User feedback on Somnus is positive. Our study participants achieved better accuracy in less time using Somnus, and preferred it over carefully crafted textual descriptions. Further, we provide two example applications to demonstrate the utility and versatility of Somnus.
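As a rough illustration of the provenance-graph idea, the following minimal Python sketch treats data tables as nodes and transformations as labeled edges. It is not the Somnus implementation: the class `ProvenanceGraph`, its methods, the table names, and the transformation labels are all hypothetical and chosen only for this example, and the graph is assumed to be acyclic.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceGraph:
    """Hypothetical sketch: tables are nodes, transformations are labeled edges."""

    # Maps a source table name to a list of (transformation label, output table).
    edges: dict = field(default_factory=dict)

    def add_transformation(self, inputs, label, output):
        """Record that `output` was produced from `inputs` by the step `label`."""
        for src in inputs:
            self.edges.setdefault(src, []).append((label, output))
        # Make sure the output table exists as a node even if nothing uses it yet.
        self.edges.setdefault(output, [])

    def lineage(self, table):
        """Return every table that `table` was derived from, transitively.

        Assumes the graph has no cycles.
        """
        parents = {
            src
            for src, outs in self.edges.items()
            for _, dst in outs
            if dst == table
        }
        result = set(parents)
        for parent in parents:
            result |= self.lineage(parent)
        return result


# Illustrative script: three transformation steps over made-up tables.
graph = ProvenanceGraph()
graph.add_transformation(["raw.csv"], "delete_rows", "t1")
graph.add_transformation(["t1"], "create_columns", "t2")
graph.add_transformation(["t2", "lookup.csv"], "combine_tables", "t3")

ancestors = graph.lineage("t3")
```

Walking the edges forward gives an overview of how tables evolve, while each edge label marks the point where a glyph for an individual transformation could be attached.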