Machine learning (ML) models are applied in an increasing variety of domains. The availability of large amounts of data and computational resources encourages the development of ever more complex and valuable models. These models are considered intellectual property of the legitimate parties who have trained them, which makes their protection against stealing, illegitimate redistribution, and unauthorized application an urgent need. Digital watermarking presents a strong mechanism for marking model ownership and, thereby, offers protection against those threats. This work presents a taxonomy identifying and analyzing different classes of watermarking schemes for ML models. It introduces a unified threat model to allow structured reasoning on and comparison of the effectiveness of watermarking methods in different scenarios. Furthermore, it systematizes desired security requirements and attacks against ML model watermarking. Based on that framework, representative literature from the field is surveyed to illustrate the taxonomy. Finally, shortcomings and general limitations of existing approaches are discussed, and an outlook on future research directions is given.
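To make the core idea concrete, the sketch below illustrates one widely studied class of schemes, trigger-set ("backdoor") watermarking: the owner trains the model on a few secret, mislabeled out-of-distribution points, and later proves ownership by showing the suspect model reproduces those secret labels. All names, data, and the toy 1-NN "model" are illustrative assumptions, not the method of any particular surveyed paper; real schemes embed the trigger set into deep networks.

```python
# Hedged sketch of trigger-set (backdoor) watermarking.
# The 1-NN classifier stands in for a trained ML model; it trivially
# memorizes the trigger set, which real schemes achieve via training.
import numpy as np

rng = np.random.default_rng(0)

# Ordinary training data: two Gaussian blobs, labels 0 and 1.
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Secret trigger set: out-of-distribution points deliberately assigned
# label 1.  Knowledge of these points is the ownership watermark.
X_trig = rng.normal(-6, 0.1, (5, 2))
y_trig = np.ones(5, dtype=int)

# "Training" = the model absorbs clean data plus the trigger set.
X_all = np.vstack([X, X_trig])
y_all = np.concatenate([y, y_trig])


def predict_1nn(X_train, y_train, X_query):
    """Toy model: label each query by its nearest training point."""
    d = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return y_train[d.argmin(axis=1)]


# Verification: the owner queries the suspect model with the secret
# trigger set; high trigger accuracy is taken as evidence of ownership.
trigger_acc = (predict_1nn(X_all, y_all, X_trig) == y_trig).mean()
watermark_present = trigger_acc >= 0.8  # illustrative threshold
```

The design choice worth noting is that verification needs only black-box query access, which is exactly the setting a unified threat model must reason about: an adversary who fine-tunes or compresses the stolen model may erase the memorized triggers, which is why robustness against such removal attacks is a central security requirement for these schemes.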