Neural Architecture Search (NAS) represents an emerging machine learning (ML) paradigm that automatically searches for models tailored to given tasks, which greatly simplifies the development of ML systems and propels the trend of ML democratization. Yet, little is known about the potential security risks incurred by NAS, which is concerning given the increasing use of NAS-generated models in critical domains. This work represents a solid initial step towards bridging the gap. Through an extensive empirical study of 10 popular NAS methods, we show that compared with their manually designed counterparts, NAS-generated models tend to suffer greater vulnerability to various malicious attacks (e.g., adversarial evasion, model poisoning, and functionality stealing). Further, with both empirical and analytical evidence, we provide possible explanations for such phenomena: given the prohibitive search space and training cost, most NAS methods favor models that converge fast at early training stages; this preference results in architectural properties associated with attack vulnerability (e.g., high loss smoothness and low gradient variance). Our findings not only reveal the relationships between model characteristics and attack vulnerability but also suggest the inherent connections underlying different attacks. Finally, we discuss potential remedies to mitigate such drawbacks, including increasing cell depth and suppressing skip connects, which lead to several promising research directions.
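To make the notions of loss smoothness and gradient variance concrete, below is a minimal PyTorch sketch of one way such properties could be probed empirically. It is illustrative only, not the paper's measurement code: the helper names gradient_variance and loss_smoothness, and all hyperparameters (n_batches, eps, n_dirs), are our own assumptions.

import torch
import torch.nn.functional as F

def gradient_variance(model, loader, device="cpu", n_batches=16):
    # Estimate the variance of per-batch parameter gradients; a low value
    # is a rough proxy for the "low gradient variance" property the
    # abstract associates with fast-converging NAS-generated models.
    grads = []
    model.to(device).train()
    for i, (x, y) in enumerate(loader):
        if i >= n_batches:
            break
        model.zero_grad()
        F.cross_entropy(model(x.to(device)), y.to(device)).backward()
        g = torch.cat([p.grad.flatten() for p in model.parameters()
                       if p.grad is not None])
        grads.append(g.detach().clone())
    G = torch.stack(grads)                  # (n_batches, n_params)
    return G.var(dim=0, unbiased=False).mean().item()

def loss_smoothness(model, x, y, eps=1e-2, n_dirs=8):
    # Crude smoothness probe: how much the input-space loss gradient
    # changes under small random input perturbations (smaller = smoother
    # loss surface, which also tends to ease gradient-based evasion).
    def input_grad(inp):
        inp = inp.clone().requires_grad_(True)
        loss = F.cross_entropy(model(inp), y)
        return torch.autograd.grad(loss, inp)[0].detach().flatten(1)
    g0 = input_grad(x)
    diffs = []
    for _ in range(n_dirs):
        delta = eps * torch.randn_like(x)
        diffs.append((input_grad(x + delta) - g0).norm(dim=1) / eps)
    return torch.stack(diffs).mean().item()

Under this reading of the abstract, comparing the two statistics between a NAS-generated model and a comparable manually designed one would be expected to yield lower values for the former.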