Cloud service providers have launched Machine-Learning-as-a-Service (MLaaS) platforms that allow users to access large-scale cloud-based models via APIs. In addition to prediction outputs, these APIs can also provide other information in a more human-understandable way, such as counterfactual explanations (CFs). However, such extra information inevitably makes cloud models more vulnerable to extraction attacks, which aim to steal the internal functionality of models in the cloud. Due to the black-box nature of cloud models, existing attack strategies require a vast number of queries before the substitute model achieves high fidelity. In this paper, we propose a simple yet efficient querying strategy that greatly improves query efficiency when stealing a classification model. It is motivated by our observation that current querying strategies suffer from a decision-boundary-shift issue, induced by training the substitute model on far-distant queries and close-to-boundary CFs. We then propose the DualCF strategy to circumvent this issue: it takes not only the CF but also the counterfactual explanation of the CF (CCF) as a pair of training samples for the substitute model. Extensive experimental evaluations are conducted on both synthetic and real-world datasets. The results show that DualCF can efficiently and effectively produce a high-fidelity substitute model with fewer queries.
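The DualCF idea described above can be sketched in a few lines. The sketch below is illustrative only: the hidden linear cloud model, the projection-based counterfactual generator, the seed distribution, and the use of logistic regression as the substitute are all assumptions for demonstration, not the paper's actual setup. The key point is that each CF and its CCF land just on opposite sides of the true decision boundary, so the pair provides balanced, boundary-hugging training data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# --- Hypothetical black-box cloud model (assumed, not from the paper) ---
w = np.array([1.0, 1.0])  # hidden linear decision boundary: sign(w @ x)

def cloud_predict(x):
    """Label returned by the cloud API for a query point."""
    return int(w @ x > 0)

def cloud_counterfactual(x, margin=0.05):
    """Minimal-change counterfactual: project x just across the boundary."""
    d = (w @ x) / (w @ w)                       # signed offset along w
    step = -(d + np.sign(d) * margin / np.linalg.norm(w))
    return x + step * w                          # point slightly on the other side

# --- DualCF-style querying: use (CF, CCF) pairs as training samples ---
rng = np.random.default_rng(0)
X_seed = rng.uniform(-3, 3, size=(50, 2))        # arbitrary seed queries

X_train, y_train = [], []
for x in X_seed:
    cf = cloud_counterfactual(x)                 # CF of the query
    ccf = cloud_counterfactual(cf)               # CF of the CF (CCF)
    for z in (cf, ccf):                          # both lie near the boundary,
        X_train.append(z)                        # one on each side
        y_train.append(cloud_predict(z))

substitute = LogisticRegression().fit(np.array(X_train), y_train)

# Fidelity: agreement between substitute and cloud model on fresh points
X_test = rng.uniform(-3, 3, size=(1000, 2))
agree = np.mean([substitute.predict(z.reshape(1, -1))[0] == cloud_predict(z)
                 for z in X_test])
```

Because every CF/CCF pair straddles the boundary at a small, symmetric margin, the substitute trained on these pairs closely tracks the cloud model's boundary without the shift that far-distant queries would introduce.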