As the rate of data collection continues to grow rapidly, developing visualization tools that scale to immense data sets is a serious and ever-increasing challenge. Existing approaches generally seek to decouple storage and visualization systems, performing just-in-time data reduction to transparently avoid overloading the visualizer. We present a new architecture in which the visualizer and data store are tightly coupled. Unlike systems that read raw data from storage, our system achieves performance that scales linearly with the size of the final visualization and is essentially independent of the size of the data. Thus, it scales to massive data sets while supporting interactive performance (sub-100 ms query latency). This enables a new class of visualization clients that manage data automatically, quickly and transparently requesting data from the underlying database without requiring the user to explicitly initiate queries. The architecture lays the groundwork for truly interactive exploration of big data and opens new directions for research on scalable information visualization systems.