通过数据中毒愚弄部分依赖性 (Fooling Partial Dependence via Data Poisoning) - 专知论文

会员服务 ·

0

Performer · MoDELS · 可理解性 · 黑盒 · 稳健性 ·

2021 年 5 月 26 日

Fooling Partial Dependence via Data Poisoning

翻译：通过数据中毒愚弄部分依赖性

Hubert Baniecki,Wojciech Kretowicz,Przemyslaw Biecek

Many methods have been developed to understand complex predictive models and high expectations are placed on post-hoc model explainability. It turns out that such explanations are not robust nor trustworthy, and they can be fooled. This paper presents techniques for attacking Partial Dependence (plots, profiles, PDP), which are among the most popular methods of explaining any predictive model trained on tabular data. We showcase that PD can be manipulated in an adversarial manner, which is alarming, especially in financial or medical applications where auditability became a must-have trait supporting black-box models. The fooling is performed via poisoning the data to bend and shift explanations in the desired direction using genetic and gradient algorithms. To the best of our knowledge, this is the first work performing attacks on variable dependence explanations. The novel approach of using a genetic algorithm for doing so is highly transferable as it generalizes both ways: in a model-agnostic and an explanation-agnostic manner.

翻译：已经开发了许多方法来理解复杂的预测模型,对后热量模型的解释期望很高。事实证明, 这些解释并不可靠, 也不可靠, 并且可以被愚弄。本文介绍了攻击部分依赖性的技术( 绘图、剖面图、 PDP ), 这是最受欢迎的解释任何关于表格数据培训的预测模型的方法之一。我们展示了PD可以以对抗的方式被操纵,这令人震惊, 特别是在财务或医疗应用中, 审计变得必须具备支持黑盒模型的特性。愚弄是通过用遗传和梯度算法使数据弯曲和改变对理想方向的解释来进行的。据我们所知, 这是第一次对可变依赖性解释进行攻击。使用基因算法来这样做的新颖的方法可以高度可转让, 因为它概括了两种方法: 以模型- 不可知性和解释- 不可知性方式。

0

相关内容

Performer

【NeurIPS 2020】图神经网络的参数化解释器，Parameterized Explainer for GNN

【NeurIPS 2020】图神经网络的参数化解释器，Parameterized Explainer for GNN

专知会员服务

22+阅读 · 2020年11月13日

【KDD2020-Tutorial】对抗性的攻击和防御:前沿、进展与实践，171页ppt

【KDD2020-Tutorial】对抗性的攻击和防御:前沿、进展与实践，171页ppt

专知会员服务

80+阅读 · 2020年8月24日

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

专知会员服务

26+阅读 · 2020年7月24日

机器学习隐私综述论文，An Overview of Privacy in Machine Learning

机器学习隐私综述论文，An Overview of Privacy in Machine Learning

专知会员服务

81+阅读 · 2020年5月20日

多语言神经机器翻译综述论文，34页pdf，A Comprehensive Survey of Multilingual Neural Machine Translation

多语言神经机器翻译综述论文，34页pdf，A Comprehensive Survey of Multilingual Neural Machine Translation

专知会员服务

19+阅读 · 2020年4月25日

【Google新论文】Learning Transferable Graph Exploration 附论文下载

【Google新论文】Learning Transferable Graph Exploration 附论文下载

专知会员服务

8+阅读 · 2019年11月4日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

已删除

将门创投

4+阅读 · 2017年12月5日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

On the Veracity of Local, Model-agnostic Explanations in Audio Classification: Targeted Investigations with Adversarial Examples

On the Veracity of Local, Model-agnostic Explanations in Audio Classification: Targeted Investigations with Adversarial Examples

Arxiv

0+阅读 · 2021年7月19日

Towards the Unification and Robustness of Perturbation and Gradient Based Explanations

Arxiv

0+阅读 · 2021年7月19日

BRR: Preserving Privacy of Text Data Efficiently on Device

Arxiv

0+阅读 · 2021年7月16日

Data Poisoning Attacks and Defenses to Crowdsourcing Systems

Arxiv

8+阅读 · 2021年2月18日

Backdoor Learning: A Survey

Arxiv

14+阅读 · 2020年10月26日

Adversarial Metric Attack for Person Re-identification

Adversarial Metric Attack for Person Re-identification

Arxiv

3+阅读 · 2019年1月30日

Adversarial Meta-Learning

Arxiv

7+阅读 · 2018年6月8日

Fooling OCR Systems with Adversarial Text Images

Arxiv

3+阅读 · 2018年2月15日

Active Learning from Positive and Unlabeled Data

Arxiv

3+阅读 · 2016年2月24日

Explaining and Harnessing Adversarial Examples

Arxiv

4+阅读 · 2015年3月20日

VIP会员

文章信息

相关主题

相关VIP内容

【NeurIPS 2020】图神经网络的参数化解释器，Parameterized Explainer for GNN

【NeurIPS 2020】图神经网络的参数化解释器，Parameterized Explainer for GNN

专知会员服务

22+阅读 · 2020年11月13日

【KDD2020-Tutorial】对抗性的攻击和防御:前沿、进展与实践，171页ppt

【KDD2020-Tutorial】对抗性的攻击和防御:前沿、进展与实践，171页ppt

专知会员服务

80+阅读 · 2020年8月24日

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

最新《人脸识别对抗攻击》综述 | Threat of Adversarial Attacks on Face Recognition: A Comprehensive Survey

专知会员服务

26+阅读 · 2020年7月24日

机器学习隐私综述论文，An Overview of Privacy in Machine Learning

机器学习隐私综述论文，An Overview of Privacy in Machine Learning

专知会员服务

81+阅读 · 2020年5月20日

多语言神经机器翻译综述论文，34页pdf，A Comprehensive Survey of Multilingual Neural Machine Translation

多语言神经机器翻译综述论文，34页pdf，A Comprehensive Survey of Multilingual Neural Machine Translation

专知会员服务

19+阅读 · 2020年4月25日

【Google新论文】Learning Transferable Graph Exploration 附论文下载

【Google新论文】Learning Transferable Graph Exploration 附论文下载

专知会员服务

8+阅读 · 2019年11月4日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

操作系统智能体：基于多模态大模型（MLLM）的通用计算设备智能体综述

《美国太空军系统全生命周期建模、仿真与分析效能提升方案》最新84页报告

【博士论文】推进数据高效的深度学习：非参数 Transformer、主动测试与上下文学习

自主人工智能：未来战争是否将是自主化的？

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

已删除

将门创投

4+阅读 · 2017年12月5日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

On the Veracity of Local, Model-agnostic Explanations in Audio Classification: Targeted Investigations with Adversarial Examples

On the Veracity of Local, Model-agnostic Explanations in Audio Classification: Targeted Investigations with Adversarial Examples

Arxiv

0+阅读 · 2021年7月19日

Towards the Unification and Robustness of Perturbation and Gradient Based Explanations

Arxiv

0+阅读 · 2021年7月19日

BRR: Preserving Privacy of Text Data Efficiently on Device

Arxiv

0+阅读 · 2021年7月16日

Data Poisoning Attacks and Defenses to Crowdsourcing Systems

Arxiv

8+阅读 · 2021年2月18日

Backdoor Learning: A Survey

Arxiv

14+阅读 · 2020年10月26日

Adversarial Metric Attack for Person Re-identification

Adversarial Metric Attack for Person Re-identification

Arxiv

3+阅读 · 2019年1月30日

Adversarial Meta-Learning

Arxiv

7+阅读 · 2018年6月8日

Fooling OCR Systems with Adversarial Text Images

Arxiv

3+阅读 · 2018年2月15日

Active Learning from Positive and Unlabeled Data

Arxiv

3+阅读 · 2016年2月24日

Explaining and Harnessing Adversarial Examples

Arxiv

4+阅读 · 2015年3月20日

微信扫码咨询专知VIP会员