We present a novel watermarking scheme to verify the ownership of DNN models. Existing solutions embed watermarks into the model parameters, which have been shown to be detectable and removable by an adversary, invalidating the protection. In contrast, we propose to implant watermarks into the model architecture. We design new algorithms based on Neural Architecture Search (NAS) to generate watermarked architectures that are unique enough to represent ownership while maintaining high model usability. We further leverage cache side channels to extract and verify watermarks from black-box models at inference time. Theoretical analysis and extensive evaluations show that our scheme has negligible impact on model performance and exhibits strong robustness against various model transformations.