Reusable data/code and reproducible analyses are foundational to quality research. These aspects, however, are often overlooked when designing interactive stream analysis workflows for time-series data (e.g., eye-tracking data). A mechanism to transmit informative metadata alongside data may allow such workflows to intelligently consume data, propagate metadata to downstream tasks, and thereby auto-generate reusable, reproducible analytic outputs with zero supervision. Moreover, a visual programming interface to design, develop, and execute such workflows may allow rapid prototyping for interdisciplinary research. Capitalizing on these ideas, we propose StreamingHub, a framework to build metadata-propagating, interactive stream analysis workflows using visual programming. We conduct two case studies to evaluate the generalizability of our framework. In the same case studies, we use two heuristics to evaluate the computational fluidity and data growth of the workflows. Results show that our framework generalizes to multiple tasks with minimal performance overhead.