We present "PATRED", a technique that adds redundant information to make specific, generally described patterns easier to detect in line charts during visual exploration. We compared versions of the technique that differed in how the redundancy was added, using nine distance metrics (such as Euclidean, Pearson, Mutual Information and Jaccard) and taking judgments from data scientists as the "ground truth". Results were analyzed using correlations (R²), F1 scores and Mutual Information against the average ranking by the data scientists. Some distance metrics consistently benefit from the added redundant information, while others improve only for specific types of data perturbation. The results demonstrate the value of adding redundancy to improve the identification of patterns in time-series data during visual exploration.
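The evaluation idea described above (rank candidate series by a distance metric, then check agreement with a human ranking) can be illustrated with a minimal Python sketch. It does not implement PATRED's redundancy step, which the abstract does not specify. The series, the candidate names, the expert ranks, and all function names below are hypothetical and chosen only for illustration.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pearson_distance(a, b):
    """1 - Pearson correlation; returns 1.0 when either series is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # correlation is undefined; treat as uncorrelated
    return 1.0 - cov / math.sqrt(var_a * var_b)

def rank_by_distance(pattern, candidates, metric):
    """Candidate names ordered from most to least similar to the pattern."""
    return sorted(candidates, key=lambda name: metric(pattern, candidates[name]))

def r_squared(xs, ys):
    """Squared Pearson correlation between two lists of ranks."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return (cov * cov) / (var_x * var_y)

# Hypothetical query pattern ("steadily rising") and candidate series.
pattern = [0, 1, 2, 3, 4]
candidates = {
    "rising": [0, 1, 2, 3, 5],
    "flat": [2, 2, 2, 2, 2],
    "falling": [4, 3, 2, 1, 0],
}
# Hypothetical average expert ranking (1 = best match); not data from the study.
expert_rank = {"rising": 1, "flat": 2, "falling": 3}

for metric in (euclidean_distance, pearson_distance):
    order = rank_by_distance(pattern, candidates, metric)
    metric_rank = {name: i + 1 for i, name in enumerate(order)}
    names = sorted(candidates)
    score = r_squared([metric_rank[n] for n in names],
                      [expert_rank[n] for n in names])
    print(metric.__name__, order, round(score, 3))
```

In this toy setup both metrics reproduce the assumed expert order. Under the study's design, each metric would additionally be applied to the redundancy-augmented versions of the series, and the agreement scores compared.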