Visualization plays an important role in analyzing and exploring time series data. To facilitate efficient visualization of large datasets, downsampling has emerged as a well-established approach. This work concentrates on LTTB (Largest-Triangle-Three-Buckets), a widely adopted downsampling algorithm for time series data point selection. Specifically, we propose MinMaxLTTB, a two-step algorithm that significantly improves the scalability of LTTB. MinMaxLTTB consists of two steps: (i) the MinMax algorithm preselects a certain ratio of minimum and maximum data points, and (ii) the LTTB algorithm is applied to only these preselected data points, effectively reducing LTTB's time complexity. The low computational cost of the MinMax algorithm, along with its parallelization capabilities, facilitates efficient preselection of data points. Additionally, the competitive performance of MinMax in terms of visual representativeness makes it an effective reduction method. Experiments show that MinMaxLTTB outperforms LTTB by more than an order of magnitude in terms of computation time. Furthermore, preselecting a small multiple of the desired output size already provides visual representativeness similar to that of LTTB. In summary, MinMaxLTTB leverages the computational efficiency of MinMax to scale LTTB without compromising LTTB's favored visualization properties. The accompanying code and experiments of this paper can be found at https://github.com/predict-idlab/MinMaxLTTB.
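The two steps described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' reference implementation (see the repository linked above for that): the function names `lttb`, `minmax`, and `minmaxlttb`, the parameter name `minmax_ratio`, and its default of 4 are all hypothetical choices made here, and the bucket boundaries are computed with a simple equal-width split.

```python
import numpy as np


def lttb(x, y, n_out):
    """Plain LTTB: return the indices of n_out selected points."""
    n = len(y)
    if n_out < 3:
        raise ValueError("n_out must be at least 3")
    if n_out >= n:
        return np.arange(n)
    # n_out - 2 buckets over the interior points; first and last are always kept.
    edges = np.linspace(1, n - 1, n_out - 1).astype(int)
    selected = np.empty(n_out, dtype=np.int64)
    selected[0], selected[-1] = 0, n - 1
    a = 0  # index of the previously selected point
    for i in range(n_out - 2):
        start, end = edges[i], edges[i + 1]
        if i < n_out - 3:
            # Third vertex: mean of the next bucket.
            cx = x[edges[i + 1]:edges[i + 2]].mean()
            cy = y[edges[i + 1]:edges[i + 2]].mean()
        else:
            # Final bucket: the third vertex is the last data point.
            cx, cy = x[-1], y[-1]
        bx, by = x[start:end], y[start:end]
        # Twice the triangle area (the constant factor does not affect argmax).
        area = np.abs((x[a] - cx) * (by - y[a]) - (x[a] - bx) * (cy - y[a]))
        a = start + int(np.argmax(area))
        selected[i + 1] = a
    return selected


def minmax(y, n_out):
    """MinMax: return sorted indices of the min and max of n_out // 2 buckets."""
    n = len(y)
    if n_out >= n:
        return np.arange(n)
    edges = np.linspace(0, n, n_out // 2 + 1).astype(int)
    idx = []
    for start, end in zip(edges[:-1], edges[1:]):
        chunk = y[start:end]
        idx.append(start + int(np.argmin(chunk)))
        idx.append(start + int(np.argmax(chunk)))
    return np.unique(idx)  # sorted; duplicates collapse if a bucket is constant


def minmaxlttb(x, y, n_out, minmax_ratio=4):
    """Step (i): MinMax preselection; step (ii): LTTB on the preselected points."""
    n = len(y)
    if n <= n_out * minmax_ratio:
        return lttb(x, y, n_out)  # too few points for preselection to help
    # Preselect on the interior so the first and last points are always kept.
    pre = 1 + minmax(y[1:-1], n_out * minmax_ratio)
    pre = np.concatenate(([0], pre, [n - 1]))
    return pre[lttb(x[pre], y[pre], n_out)]
```

In this sketch, `minmaxlttb(x, y, 1000)` on a series of 100,000 points runs LTTB over roughly 4,000 preselected points rather than the full input. The per-bucket loop in `minmax` is written for readability; the bucket-wise min/max search is independent per bucket, which is what makes the preselection step straightforward to parallelize.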