Attack-agnostic Adversarial Detection on Medical Data Using Explainable Machine Learning

Explainable machine learning has become increasingly prevalent, especially in healthcare, where explainable models are vital for ethical and trusted automated decision making. Work on the susceptibility of deep learning models to adversarial attacks has shown how easily samples can be designed to mislead a model into making incorrect predictions. In this work, we propose a model-agnostic, explainability-based method for the accurate detection of adversarial samples on two types of data with different complexity and properties: Electronic Health Record (EHR) and chest X-ray (CXR) data. On the MIMIC-III and Henan-Renmin EHR datasets, we report a detection accuracy of 77% against the Longitudinal Adversarial Attack. On the MIMIC-CXR dataset, we achieve an accuracy of 88%, significantly improving on the state of the art in adversarial detection for both data types by over 10% in all settings. The proposed anomaly-detection-based method uses explainability techniques to detect adversarial samples and generalises to different attack methods without retraining.
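The general idea of detecting adversarial inputs as anomalies in explanation space can be illustrated with a toy sketch. The abstract does not say which explainer or anomaly detector the authors use, so everything below is an assumption made for illustration: a fixed linear scorer stands in for the model, the per-feature products `w_i * x_i` stand in for the explanation, and a z-score detector fitted only on clean samples stands in for the anomaly detector. The perturbation is deliberately exaggerated, and all names (`attributions`, `AttributionAnomalyDetector`) are hypothetical.

```python
import random
import statistics


def attributions(weights, x):
    """Per-feature contribution w_i * x_i of a linear scorer (a stand-in explainer)."""
    return [w * xi for w, xi in zip(weights, x)]


class AttributionAnomalyDetector:
    """Flags samples whose attributions deviate strongly from those of clean data."""

    def fit(self, clean_attrs, quantile=0.95):
        # Per-feature mean and standard deviation of the clean attributions.
        columns = list(zip(*clean_attrs))
        self.means = [statistics.fmean(col) for col in columns]
        self.stds = [statistics.pstdev(col) or 1.0 for col in columns]
        # Threshold: the chosen quantile of the anomaly scores of clean samples.
        scores = sorted(self.score(a) for a in clean_attrs)
        self.threshold = scores[int(quantile * (len(scores) - 1))]
        return self

    def score(self, attr):
        # Largest absolute z-score over all features.
        return max(abs(a - m) / s for a, m, s in zip(attr, self.means, self.stds))

    def is_adversarial(self, attr):
        return self.score(attr) > self.threshold


# Hypothetical model weights and synthetic "clean" data (no real medical data).
random.seed(0)
weights = [0.8, -0.5, 0.3, 1.2]
clean = [[random.gauss(0.0, 1.0) for _ in weights] for _ in range(500)]

# The detector sees only clean samples, never any attack.
detector = AttributionAnomalyDetector().fit([attributions(weights, x) for x in clean])

typical_input = [0.0, 0.0, 0.0, 0.0]
perturbed_input = [0.0, 0.0, 0.0, 6.0]  # exaggerated perturbation on one feature

print(detector.is_adversarial(attributions(weights, typical_input)))
print(detector.is_adversarial(attributions(weights, perturbed_input)))
```

Because the detector is fitted on clean attributions alone, it carries no knowledge of any specific attack, which is the property the abstract describes as generalising to different attack methods without retraining. A realistic setting would need a proper explainer for the deep model and a stronger anomaly detector than this z-score rule.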

