存储高效接近影响函数的深神经网络数据清理 (Data Cleansing for Deep Neural Networks with Storage-efficient Approximation of Influence Functions) - 专知论文

会员服务 ·

0

可约的 · Neural Networks · 模型评估 · 轮 · Networking ·

2021 年 3 月 22 日

Data Cleansing for Deep Neural Networks with Storage-efficient Approximation of Influence Functions

翻译：存储高效接近影响函数的深神经网络数据清理

Kenji Suzuki,Yoshiyuki Kobayashi,Takuya Narihira

Identifying the influence of training data for data cleansing can improve the accuracy of deep learning. An approach with stochastic gradient descent (SGD) called SGD-influence to calculate the influence scores was proposed, but, the calculation costs are expensive. It is necessary to temporally store the parameters of the model during training phase for inference phase to calculate influence sores. In close connection with the previous method, we propose a method to reduce cache files to store the parameters in training phase for calculating inference score. We only adopt the final parameters in last epoch for influence functions calculation. In our experiments on classification, the cache size of training using MNIST dataset with our approach is 1.236 MB. On the other hand, the previous method used cache size of 1.932 GB in last epoch. It means that cache size has been reduced to 1/1,563. We also observed the accuracy improvement by data cleansing with removal of negatively influential data using our approach as well as the previous method. Moreover, our simple and general proposed method to calculate influence scores is available on our auto ML tool without programing, Neural Network Console. The source code is also available.

翻译：确定数据清理培训数据的影响可以提高深层学习的准确性。提出了一个叫做 SGD- impact( SGD- impact) 以计算影响分数的方法, 但计算成本非常昂贵。有必要将模型参数暂时存储在测试阶段的培训阶段, 以计算疼痛影响。与前一种方法密切关联, 我们建议了一种方法, 减少缓存文件, 以存储用于计算推算分的培训阶段的参数。我们只是在最后的时段采用最后的参数来计算影响函数。在我们的分类实验中, 使用 MNIST 数据集进行的培训的缓存大小为 1.236 MB 。另一方面, 上次的缓存规模为 1. 932 GB 。这意味着缓存规模已经缩小到 1/1 563 。我们还观察到通过数据清理来改进准确性, 通过使用我们的方法和前一种方法删除有负影响的数据。此外, 我们用于计算影响分数的简单和一般的拟议方法, 可以在不编程的自动 ML 工具 Neural 网络中找到源码。

0

相关内容

可约的

最新【深度生成模型】Deep Generative Models，104页ppt

最新【深度生成模型】Deep Generative Models，104页ppt

专知会员服务

71+阅读 · 2020年10月24日

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

专知会员服务

74+阅读 · 2020年7月6日

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

专知会员服务

35+阅读 · 2020年4月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【论文】深度卷积神经网络的ImageNet分类（ImageNet Classification with Deep Convolutional Neural Networks）

【论文】深度卷积神经网络的ImageNet分类（ImageNet Classification with Deep Convolutional Neural Networks）

专知会员服务

14+阅读 · 2020年1月1日

【论文推荐】深度学习中贝叶斯不确定性简单基线（A simple baseline for bayesian uncertainty in deep learning）

【论文推荐】深度学习中贝叶斯不确定性简单基线（A simple baseline for bayesian uncertainty in deep learning）

专知会员服务

46+阅读 · 2019年12月25日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【推荐】迁移学习：用0.046%的训练样本(6张图片)超过2013 Kaggle猫狗识别竞赛领先水平（附代码）

【推荐】迁移学习：用0.046%的训练样本(6张图片)超过2013 Kaggle猫狗识别竞赛领先水平（附代码）

机器学习研究会

5+阅读 · 2017年9月24日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Tuna: A Static Analysis Approach to Optimizing Deep Neural Networks

Arxiv

0+阅读 · 2021年5月16日

System identification using Bayesian neural networks with nonparametric noise models

Arxiv

0+阅读 · 2021年5月14日

DeepPlastic: A Novel Approach to Detecting Epipelagic Bound Plastic Using Deep Visual Models

Arxiv

0+阅读 · 2021年5月13日

On the capacity of deep generative networks for approximating distributions

Arxiv

0+阅读 · 2021年5月13日

Efficient Algorithms for Estimating the Parameters of Mixed Linear Regression Models

Arxiv

0+阅读 · 2021年5月12日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

Kernel Based Progressive Distillation for Adder Neural Networks

Arxiv

5+阅读 · 2020年9月29日

Improving Collaborative Metric Learning with Efficient Negative Sampling

Arxiv

3+阅读 · 2019年9月24日

Parsimonious Bayesian deep networks

Parsimonious Bayesian deep networks

Arxiv

5+阅读 · 2018年10月17日

A Dual Approach to Scalable Verification of Deep Networks

A Dual Approach to Scalable Verification of Deep Networks

Arxiv

3+阅读 · 2018年8月3日

VIP会员

文章信息

相关主题

Neural Networks

相关VIP内容

最新【深度生成模型】Deep Generative Models，104页ppt

最新【深度生成模型】Deep Generative Models，104页ppt

专知会员服务

71+阅读 · 2020年10月24日

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

【ICML2020】深度神经网络置信感知学习，Conﬁdence-Aware Learning for Deep Neural Networks

专知会员服务

74+阅读 · 2020年7月6日

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

神经网络的拓扑结构，TOPOLOGY OF DEEP NEURAL NETWORKS

专知会员服务

35+阅读 · 2020年4月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【论文】深度卷积神经网络的ImageNet分类（ImageNet Classification with Deep Convolutional Neural Networks）

【论文】深度卷积神经网络的ImageNet分类（ImageNet Classification with Deep Convolutional Neural Networks）

专知会员服务

14+阅读 · 2020年1月1日

【论文推荐】深度学习中贝叶斯不确定性简单基线（A simple baseline for bayesian uncertainty in deep learning）

【论文推荐】深度学习中贝叶斯不确定性简单基线（A simple baseline for bayesian uncertainty in deep learning）

专知会员服务

46+阅读 · 2019年12月25日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《俄乌战争背景下俄罗斯的战略性海军分析（2022-2025年）》最新100页报告

【斯坦福博士论文】数据、决策与依赖：构建可信人工智能的挑战

人工智能时代背景下的未来海战

接触战中的无人机优势：美军旅级部队面临的小型无人机系统挑战与调整

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【推荐】迁移学习：用0.046%的训练样本(6张图片)超过2013 Kaggle猫狗识别竞赛领先水平（附代码）

【推荐】迁移学习：用0.046%的训练样本(6张图片)超过2013 Kaggle猫狗识别竞赛领先水平（附代码）

机器学习研究会

5+阅读 · 2017年9月24日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Tuna: A Static Analysis Approach to Optimizing Deep Neural Networks

Arxiv

0+阅读 · 2021年5月16日

System identification using Bayesian neural networks with nonparametric noise models

Arxiv

0+阅读 · 2021年5月14日

DeepPlastic: A Novel Approach to Detecting Epipelagic Bound Plastic Using Deep Visual Models

Arxiv

0+阅读 · 2021年5月13日

On the capacity of deep generative networks for approximating distributions

Arxiv

0+阅读 · 2021年5月13日

Efficient Algorithms for Estimating the Parameters of Mixed Linear Regression Models

Arxiv

0+阅读 · 2021年5月12日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

Kernel Based Progressive Distillation for Adder Neural Networks

Arxiv

5+阅读 · 2020年9月29日

Improving Collaborative Metric Learning with Efficient Negative Sampling

Arxiv

3+阅读 · 2019年9月24日

Parsimonious Bayesian deep networks

Parsimonious Bayesian deep networks

Arxiv

5+阅读 · 2018年10月17日

A Dual Approach to Scalable Verification of Deep Networks

A Dual Approach to Scalable Verification of Deep Networks

Arxiv

3+阅读 · 2018年8月3日

微信扫码咨询专知VIP会员