Adversarial examples are crafted by adding indistinguishable perturbations to normal examples in order to fool a well-trained deep learning model into misclassifying them. In computer vision, this notion of indistinguishability is typically bounded by the $L_{\infty}$ norm or other norms. However, these norms are not appropriate for measuring indistinguishability in time series data. In this work, we propose, for the first time, adversarial examples in the Wasserstein space for time series data, using the Wasserstein distance to bound the perturbation between normal and adversarial examples. We introduce Wasserstein projected gradient descent (WPGD), an adversarial attack method for perturbing univariate time series data. We leverage the closed-form solution of the Wasserstein distance in the 1D space to compute the projection step of WPGD efficiently with gradient descent. We further propose a two-step projection so that the search for adversarial examples in the Wasserstein space is guided and constrained by Euclidean norms, yielding more effective and imperceptible perturbations. We empirically evaluate the proposed attack on several time series datasets in the healthcare domain. Extensive results demonstrate that the Wasserstein attack is powerful and successfully attacks most of the target classifiers with a high attack success rate. To better study the nature of Wasserstein adversarial examples, we evaluate a strong defense mechanism, Wasserstein smoothing, as a potential certified robustness defense. Although this defense achieves some accuracy gains, it still has limitations in many cases, leaving room for developing a stronger certified robustness method against Wasserstein adversarial examples on univariate time series data.
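The closed-form 1D solution mentioned above is the standard identity that the 1-Wasserstein distance between two distributions on a shared grid equals the $L_1$ distance between their cumulative distribution functions. A minimal sketch of that identity (the function name and the normalization of the inputs are illustrative assumptions, not details from the paper):

```python
import numpy as np

def wasserstein_1d(a, b):
    """Closed-form 1-Wasserstein distance between two 1D distributions.

    `a` and `b` are nonnegative weight vectors on the same unit-spaced
    grid; each is normalized to sum to 1. In one dimension, W1 reduces
    to the L1 distance between the two CDFs, so no optimal-transport
    solver is needed -- a single cumulative sum suffices.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / a.sum()
    b = b / b.sum()
    # L1 distance between the cumulative distribution functions.
    return float(np.abs(np.cumsum(a) - np.cumsum(b)).sum())
```

For example, moving all mass from the first grid point to the third costs 2 (one unit of mass transported over two grid steps), which is what `wasserstein_1d([1, 0, 0], [0, 0, 1])` returns. This O(n) form is what makes a Wasserstein projection step inside a PGD loop tractable for univariate time series.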