Data science workflows are human-centered processes involving on-demand programming and analysis. While programmable and interactive interfaces such as widgets embedded within computational notebooks are suitable for these workflows, they lack robust state management capabilities and do not support user-defined customization of the interactive components. The absence of such capabilities hinders workflow reusability and transparency and limits the scope of exploration available to end users. In response, we developed MAGNETON, a framework for authoring interactive widgets within computational notebooks that enables transparent, reusable, and customizable data science workflows. The framework enhances existing widgets to support fine-grained interaction history management, reusable states, and user-defined customizations. We conducted three case studies in a real-world knowledge graph construction and serving platform to evaluate the effectiveness of these widgets. Based on our observations, we discuss the implications of employing MAGNETON widgets in general-purpose data science workflows.