Data Appraisal Without Data Sharing - Zhuanzhi Papers

December 11, 2020

Mimee Xu, Laurens van der Maaten, Awni Hannun

From arXiv. Presented at the NeurIPS Workshop for Privacy-Preserving Machine Learning (PPML 2020).

One of the most effective approaches to improving the performance of a machine-learning model is to acquire additional training data. To do so, a model owner may seek to acquire relevant training data from a data owner. Before procuring the data, the model owner needs to appraise the data. However, the data owner generally does not want to share the data until after an agreement is reached. The resulting Catch-22 prevents efficient data markets from forming. To address this problem, we develop data appraisal methods that do not require data sharing by using secure multi-party computation. Specifically, we study methods that: (1) compute parameter gradient norms, (2) perform model fine-tuning, and (3) compute influence functions. Our experiments show that influence functions provide an appealing trade-off between high-quality appraisal and required computation.
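The three appraisal signals named in the abstract can be illustrated with a small plaintext sketch. This is not the authors' implementation: the paper evaluates these quantities under secure multi-party computation, whereas the code below runs in the clear with NumPy on a synthetic logistic-regression task. All function names, the regularization strength, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(w, X, y):
    p = sigmoid(X @ w)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def grad(w, X, y, l2=0.0):
    """Gradient of the mean logistic loss (plus optional L2 term) w.r.t. w."""
    return X.T @ (sigmoid(X @ w) - y) / len(y) + l2 * w

def hessian(w, X, l2):
    """Hessian of the L2-regularized mean logistic loss."""
    p = sigmoid(X @ w)
    return (X.T * (p * (1 - p))) @ X / len(X) + l2 * np.eye(X.shape[1])

def fit(X, y, l2, steps=25):
    """Newton's method; stands in for the model owner's existing model."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= np.linalg.solve(hessian(w, X, l2), grad(w, X, y, l2))
    return w

def gradient_norm_score(w, X_new, y_new):
    """(1) Norm of the parameter gradient on the candidate data."""
    return float(np.linalg.norm(grad(w, X_new, y_new)))

def finetune_score(w, X_new, y_new, X_test, y_test, lr=0.5, steps=20):
    """(2) Test-loss reduction after a few gradient steps on the candidate data."""
    w_ft = w.copy()
    for _ in range(steps):
        w_ft -= lr * grad(w_ft, X_new, y_new)
    return float(log_loss(w, X_test, y_test) - log_loss(w_ft, X_test, y_test))

def influence_score(w, X_train, X_new, y_new, X_test, y_test, l2):
    """(3) First-order influence estimate g_test^T H^{-1} g_new.
    Positive values suggest that upweighting the candidate data lowers test loss."""
    g_new = grad(w, X_new, y_new)
    g_test = grad(w, X_test, y_test)
    return float(g_test @ np.linalg.solve(hessian(w, X_train, l2), g_new))

# Synthetic setup (hypothetical): one clean and one mislabeled candidate set.
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)

def sample(n):
    X = rng.normal(size=(n, 5))
    return X, (X @ w_true > 0).astype(float)

L2 = 1.0
X_tr, y_tr = sample(200)      # model owner's training data
X_te, y_te = sample(500)      # model owner's held-out data
X_new, y_new = sample(500)    # data owner's candidate data
candidates = {"clean": y_new, "mislabeled": 1.0 - y_new}

w = fit(X_tr, y_tr, L2)
scores = {
    name: {
        "grad_norm": gradient_norm_score(w, X_new, labels),
        "finetune": finetune_score(w, X_new, labels, X_te, y_te),
        "influence": influence_score(w, X_tr, X_new, labels, X_te, y_te, L2),
    }
    for name, labels in candidates.items()
}
for name, s in scores.items():
    print(name, s)
```

In this toy setting the fine-tuning and influence scores are signed, so they can separate helpful data from harmful data, while the gradient norm only measures how strongly the candidate data would move the model. The influence score needs one linear solve instead of a training loop, which is in the spirit of the computation trade-off the abstract describes.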