Full-precision deep learning models are typically too large or costly to deploy on edge devices. To accommodate the limited hardware resources, models are adapted to the edge using various edge-adaptation techniques, such as quantization and pruning. While such techniques may have a negligible impact on top-line accuracy, the adapted models exhibit subtle differences in output compared to the original models from which they are derived. In this paper, we introduce a new evasive attack, DIVA, that exploits these differences in edge adaptation by adding adversarial noise to input data that maximizes the output difference between the original and adapted model. Such an attack is particularly dangerous because the malicious input will trick the adapted model running on the edge, but will be virtually undetectable by the original model, which typically serves as the authoritative model version, used for validation, debugging, and retraining. We compare DIVA to a state-of-the-art attack, PGD, and show that DIVA is only 1.7-3.6% worse at attacking the adapted model but 1.9-4.2 times more likely than PGD to go undetected by the original model under whitebox and semi-blackbox settings.
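To make the attack idea concrete, the following is a minimal PGD-style sketch of the differential objective described above: perturb the input so that the adapted (edge) model is pushed toward misclassification while the original full-precision model's prediction is preserved. The function name `diva_style_attack`, the models `full_model` and `edge_model` (assumed to be a differentiable, e.g. fake-quantized, stand-in for the deployed edge model), and the exact loss form are illustrative assumptions; DIVA's actual objective and hyperparameters may differ.

```python
import torch
import torch.nn.functional as F

def diva_style_attack(full_model, edge_model, x, label,
                      eps=8/255, alpha=2/255, steps=40, c=1.0):
    """Sketch of a differential evasion attack (assumed objective):
    maximize the edge model's loss on the true label while minimizing
    the original model's loss, so only the adapted copy is fooled."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        edge_logits = edge_model(x_adv)
        full_logits = full_model(x_adv)
        # Push edge model toward error, keep original model's prediction intact.
        loss = (F.cross_entropy(edge_logits, label)
                - c * F.cross_entropy(full_logits, label))
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()           # gradient ascent step
            x_adv = torch.clamp(x_adv, x - eps, x + eps)  # project to L_inf ball
            x_adv = torch.clamp(x_adv, 0.0, 1.0)          # keep valid pixel range
        x_adv = x_adv.detach()
    return x_adv
```

The weighting constant `c` trades off how aggressively the attack fools the edge model against how well it stays invisible to the original model; setting `c = 0` reduces the sketch to ordinary PGD against the edge model.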