We present HelpViz, a tool for generating contextual visual mobile tutorials from the text-based instructions that are abundant on the web. HelpViz transforms text instructions into graphical tutorials in batch by extracting a sequence of actions from each text instruction with an instruction parsing model, and executing the extracted actions on a simulation infrastructure that manages an array of Android emulators. Automatically executing each instruction produces a set of graphical and structural assets, including images, videos, and metadata such as the clicked element for each step. HelpViz then synthesizes a tutorial by combining the parsed text instructions with the generated assets, and contextualizes the tutorial to the user's interaction by tracking their progress and highlighting the next step. Our experiments with HelpViz indicate that our pipeline improved the robustness of tutorial execution and that participants preferred tutorials generated by HelpViz over text-based instructions. HelpViz promises a cost-effective approach for generating contextual visual tutorials for mobile interaction at scale.
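The pipeline described above can be sketched in outline. This is a minimal illustrative mock-up, not HelpViz's implementation: the parsing here is a toy sentence splitter standing in for the instruction parsing model, and the emulator step is replaced by a stub that fabricates the per-step asset records (screenshot path, clicked element); all function and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Action:
    verb: str    # e.g. "tap", "toggle" (hypothetical action vocabulary)
    target: str  # UI element named in the instruction text

def parse_instruction(text: str) -> list[Action]:
    # Toy stand-in for the instruction parsing model: treat each
    # sentence as one "<verb> <target>" step.
    actions = []
    for step in (s.strip() for s in text.split(".")):
        if not step:
            continue
        verb, _, target = step.partition(" ")
        actions.append(Action(verb.lower(), target))
    return actions

def execute_on_emulator(actions: list[Action]) -> list[dict]:
    # Stub for replaying actions on an Android emulator; in the real
    # pipeline each step yields a screenshot, a video clip, and
    # metadata such as the clicked element.
    return [
        {"step": i + 1,
         "screenshot": f"step_{i + 1}.png",
         "clicked_element": a.target}
        for i, a in enumerate(actions)
    ]

def synthesize_tutorial(text: str) -> list[dict]:
    # Pair each parsed text step with the assets produced by execution,
    # yielding one record per tutorial step.
    actions = parse_instruction(text)
    assets = execute_on_emulator(actions)
    return [
        {"instruction": f"{a.verb.capitalize()} {a.target}", **asset}
        for a, asset in zip(actions, assets)
    ]

tutorial = synthesize_tutorial("Tap Settings. Tap Wi-Fi. Toggle Airplane mode")
for step in tutorial:
    print(step)
```

A contextual renderer would then walk this list of step records, showing each step's screenshot and highlighting the next one as the user's progress advances.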