Complex, high-dimensional data is used in a wide range of domains to explore problems and make decisions. Analysis of high-dimensional data, however, is vulnerable to the hidden influence of confounding variables, especially as users apply ad hoc filtering operations to visualize only specific subsets of an entire dataset. Thus, visual data-driven analysis can mislead users and encourage mistaken assumptions about causality or the strength of relationships between features. This work introduces a novel visual approach designed to reveal the presence of confounding variables via counterfactual possibilities during visual data analysis. It is implemented in CoFact, an interactive visualization prototype that determines and visualizes \textit{counterfactual subsets} to better support user exploration of feature relationships. Using publicly available datasets, we conducted a controlled user study to demonstrate the effectiveness of our approach; the results indicate that users exposed to counterfactual visualizations formed more careful judgments about feature-to-outcome relationships.