In light of the growing popularity of Exploratory Data Analysis (EDA), understanding the underlying causes of the knowledge acquired through EDA is crucial but remains under-researched. This study proposes, for the first time, a transparent and explainable perspective on data analysis, called eXplainable Data Analysis (XDA). XDA augments data analysis with qualitative and quantitative explanations under both causal and non-causal semantics. In this way, XDA will significantly improve human understanding of, and confidence in, the outcomes of data analysis, facilitating accurate data interpretation and decision-making in the real world. To this end, we present XInsight, a general framework for XDA. XInsight is a three-module, end-to-end pipeline designed to extract causal graphs, translate causal primitives into XDA semantics, and quantify the contribution of each explanation to a data fact. XInsight uses a set of design concepts and optimizations to address the inherent difficulties of integrating causality into XDA. Experiments on synthetic and real-world datasets, as well as human evaluations, demonstrate the promising capabilities of XInsight.