Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V (Technical Report)


January 30, 2023


Roee Shraga, Renée J. Miller

From arXiv; to appear in VLDB 2023.

In multi-user environments in which data science and analysis is collaborative, multiple versions of the same datasets are generated. While managing and storing data versions has received some attention in the research literature, the semantic nature of such changes has remained under-explored. In this work, we introduce Explain-Da-V, a framework aiming to explain changes between two given dataset versions. Explain-Da-V generates explanations that use data transformations to explain changes. We further introduce a set of measures that evaluate the validity, generalizability, and explainability of these explanations. We empirically show, using an adapted existing benchmark and a newly created benchmark, that Explain-Da-V generates better explanations than existing data transformation synthesis methods.
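To make the idea of explaining a change with a data transformation concrete, the following is a minimal illustrative sketch, not the Explain-Da-V algorithm itself, whose details are not given in the abstract. It assumes a toy setting in which the newer version adds one numeric column, and it searches a small, hypothetical set of binary arithmetic transformations over the old columns. The names `CANDIDATES`, `validity`, and `explain_new_column` are assumptions made for this example, and `validity` here simply means the fraction of rows the transformation reproduces, which may differ from the measure defined in the paper.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, float]

# Hypothetical search space: a few binary arithmetic transformations.
# Each entry maps a readable template to the function it denotes.
CANDIDATES: Dict[str, Callable[[float, float], float]] = {
    "{a} + {b}": lambda a, b: a + b,
    "{a} - {b}": lambda a, b: a - b,
    "{a} * {b}": lambda a, b: a * b,
}


def validity(old: List[Row], new: List[Row], target: str,
             x: str, y: str, fn: Callable[[float, float], float],
             tol: float = 1e-9) -> float:
    """Fraction of rows where fn(old[x], old[y]) reproduces new[target]."""
    if not old:
        return 0.0
    hits = sum(
        1 for o, n in zip(old, new)
        if abs(fn(o[x], o[y]) - n[target]) <= tol
    )
    return hits / len(old)


def explain_new_column(old: List[Row], new: List[Row],
                       target: str) -> Optional[Tuple[str, float]]:
    """Return the (expression, validity) pair with the highest validity.

    Rows of `old` and `new` are assumed to be aligned by position.
    Ties are broken in favor of the first candidate found.
    """
    best: Optional[Tuple[str, float]] = None
    columns = sorted(old[0].keys()) if old else []
    for x in columns:
        for y in columns:
            if x == y:
                continue
            for template, fn in CANDIDATES.items():
                score = validity(old, new, target, x, y, fn)
                if best is None or score > best[1]:
                    best = (template.format(a=x, b=y), score)
    return best


# Toy example: the new version adds a "total" column.
old_version = [
    {"price": 2.0, "qty": 3.0},
    {"price": 4.0, "qty": 5.0},
    {"price": 1.0, "qty": 7.0},
]
new_version = [
    {"price": 2.0, "qty": 3.0, "total": 6.0},
    {"price": 4.0, "qty": 5.0, "total": 20.0},
    {"price": 1.0, "qty": 7.0, "total": 7.0},
]

result = explain_new_column(old_version, new_version, "total")
print(result)
```

In this toy case the search recovers a multiplication of `price` and `qty` as the explanation for the added column. A real system would need a far richer transformation space and would also have to handle changed or removed rows and columns, which this sketch does not attempt.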

