Keeping abreast of current trends, technologies, and best practices in visualization and data analysis is becoming increasingly difficult, especially for fledgling data scientists. In this paper, we propose Lodestar, an interactive computational notebook that allows users to quickly explore and construct new data science workflows by selecting from a list of automated analysis recommendations. We derive our recommendations from directed graphs of known analysis states, with two input sources: one manually curated from online data science tutorials, and another extracted through semi-automatic analysis of a corpus of over 6,000 Jupyter notebooks. We evaluate Lodestar in a formative study guiding our next set of improvements to the tool. Our results suggest that users find Lodestar useful for rapidly creating data science workflows.
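As a rough illustration of how recommendations might be drawn from a directed graph of analysis states, the following minimal sketch counts transitions between consecutive steps in a set of workflows and suggests the most frequent successors of the current step. The step names, the example workflows, and the frequency-based ranking are all assumptions made for illustration; the abstract does not specify how Lodestar represents states or ranks its recommendations.

```python
from collections import Counter, defaultdict


def build_transition_graph(workflows):
    """Build a directed graph whose edge weights count how often one
    analysis step is immediately followed by another."""
    graph = defaultdict(Counter)
    for steps in workflows:
        for current, following in zip(steps, steps[1:]):
            graph[current][following] += 1
    return graph


def recommend_next(graph, current_step, k=3):
    """Return up to k successors of current_step, most frequent first.

    Unknown steps yield an empty list (and are not added to the graph).
    """
    successors = graph.get(current_step, Counter())
    return [step for step, _ in successors.most_common(k)]


# Hypothetical workflows, standing in for sequences of analysis steps
# taken from tutorials or notebooks (not data from the paper).
workflows = [
    ["load_csv", "describe", "histogram"],
    ["load_csv", "describe", "drop_na", "scatter_plot"],
    ["load_csv", "drop_na", "describe"],
]

graph = build_transition_graph(workflows)
recommendations = recommend_next(graph, "load_csv")
print(recommendations)
```

In this sketch a user who has just run the hypothetical `load_csv` step would be offered `describe` first, since it follows `load_csv` in two of the three example workflows, and `drop_na` second.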