Deep Learning (DL) models increasingly power a diverse range of applications. Unfortunately, this pervasiveness also makes them attractive targets for extraction attacks, which can steal the architecture, parameters, and hyper-parameters of a targeted DL model. Existing extraction attack studies have observed varying levels of attack success for different DL models and datasets, yet the underlying causes of their susceptibility often remain unclear. Identifying such root-cause weaknesses would help in building secure DL systems, but doing so requires studying extraction attacks across a wide variety of scenarios to identify commonalities between attack success and DL characteristics. The high technical effort and time required to understand, implement, and evaluate even a single attack make it infeasible to explore the large number of unique extraction attack scenarios in existence, and current frameworks are typically designed to operate only for specific attack types, datasets, and hardware platforms. In this paper, we present PINCH: an efficient and automated extraction attack framework capable of deploying and evaluating multiple DL models and attacks across heterogeneous hardware platforms. We demonstrate the effectiveness of PINCH by empirically evaluating a large number of previously unexplored extraction attack scenarios, as well as secondary attack staging. Our key findings show that 1) multiple characteristics affect extraction attack success, spanning DL model architecture, dataset complexity, hardware, and attack type, and 2) partially successful extraction attacks significantly enhance the success of further adversarial attack staging.