We present VOICE, a novel approach for connecting the conversational capabilities of large language models (LLMs) with interactive exploratory visualization. VOICE introduces several innovative technical contributions that drive our conversational visualization framework. Its foundation is a pack-of-bots, each performing a specific task such as assigning tasks, extracting instructions, or generating coherent content. We employ fine-tuning and prompt engineering to tailor each bot to its role so that it responds accurately to user queries, and a new prompt-based iterative scene-tree generation establishes a coupling with a structural model. Our text-to-visualization method generates a flythrough sequence that matches the accompanying explanation. Finally, 3D natural language interaction provides capabilities to navigate and manipulate the 3D models in real time. The VOICE framework can receive arbitrary voice commands from the user and respond verbally, tightly coupled with the corresponding visual representation, with low latency and high accuracy. We demonstrate the effectiveness and high generalizability potential of our approach by applying it to two distinct domains: the analysis of three 3D molecular models with multi-scale and multi-instance attributes, and a cartographic map visualization. A free copy of this paper and all supplemental materials are available at https://osf.io/g7fbr/.
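To make the pack-of-bots idea concrete, the following is a minimal illustrative sketch, not the authors' implementation: a dispatcher bot assigns an incoming query to specialized bots for instruction extraction or content generation. The helper call_llm, the bot names, and the role prompts are all hypothetical stand-ins for whatever chat-completion backend and prompts an implementation would actually use.

from dataclasses import dataclass

def call_llm(system_prompt: str, user_message: str) -> str:
    """Hypothetical wrapper around an LLM chat-completion endpoint."""
    raise NotImplementedError("plug in an LLM client here")

@dataclass
class Bot:
    name: str
    system_prompt: str  # role-specific prompt; fine-tuning could further specialize the model

    def run(self, message: str) -> str:
        return call_llm(self.system_prompt, message)

# Specialized bots, each responsible for one task in the pipeline.
dispatcher = Bot("dispatcher",
                 "Decide whether the user query asks for an EXPLANATION or a "
                 "3D INTERACTION. Answer with a single word.")
extractor = Bot("instruction_extractor",
                "Extract machine-readable navigation/manipulation instructions "
                "(e.g. zoom, rotate, highlight <entity>) from the user query.")
narrator = Bot("content_generator",
               "Generate a coherent verbal explanation of the requested structure, "
               "suitable for pairing with a visualization flythrough.")

def handle_query(user_query: str) -> str:
    """Route a voice-transcribed query to the appropriate specialist bot."""
    task = dispatcher.run(user_query).strip().upper()
    if "INTERACTION" in task:
        return extractor.run(user_query)   # instructions drive the 3D scene
    return narrator.run(user_query)        # explanation drives speech output and the flythrough

In such a design, the returned instructions or explanation text would then be coupled to the scene tree and camera path; those steps are outside the scope of this sketch.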