Embedding-as-a-Service (EaaS) is an effective and convenient deployment solution for a wide range of NLP tasks. However, recent research has shown that EaaS is vulnerable to model extraction attacks, which can cause significant economic losses for model providers. To protect copyright, existing methods inject watermark embeddings into text embeddings and use them to detect infringement. These watermarking methods, however, often resist only a subset of attacks and fail to provide \textit{comprehensive} protection. To this end, we present RegionMarker, a region-triggered semantic watermarking framework that defines trigger regions within a low-dimensional space and injects watermarks into the text embeddings associated with these regions. By using a secret dimensionality reduction matrix to project embeddings onto this subspace and by selecting trigger regions at random, RegionMarker makes it difficult for watermark removal attacks to evade detection. Furthermore, by embedding watermarks across the entire trigger region and using the text embedding as the watermark, RegionMarker is resilient to both paraphrasing and dimension-perturbation attacks. Extensive experiments on various datasets show that RegionMarker resists different attack methods, thereby protecting the copyright of EaaS.
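The region-triggered idea described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration, since the abstract does not give these details: the secret projection is taken to be a random Gaussian matrix, regions are taken to be the sign patterns of the projected coordinates, and the injection rule (blending with a fixed secret target vector) is a simple stand-in rather than RegionMarker's own rule. The names `make_secret`, `region_id`, and `watermark` are hypothetical.

```python
import numpy as np

def make_secret(dim, low_dim, n_trigger, seed):
    """Build the provider's secret: projection, trigger regions, target vector."""
    rng = np.random.default_rng(seed)
    # Assumption: the secret dimensionality reduction is a random Gaussian matrix.
    proj = rng.standard_normal((dim, low_dim))
    # Assumption: regions are the 2**low_dim sign patterns; pick some at random.
    ids = rng.choice(2 ** low_dim, size=n_trigger, replace=False)
    # Assumption: a fixed unit-norm target vector serves as the watermark signal.
    target = rng.standard_normal(dim)
    target /= np.linalg.norm(target)
    return proj, {int(i) for i in ids}, target

def region_id(emb, proj):
    """Project an embedding and encode the sign pattern as an integer."""
    bits = (emb @ proj > 0).astype(int)
    return int(bits @ (1 << np.arange(bits.size)))

def watermark(emb, proj, trigger_ids, target, strength=0.2):
    """Return emb unchanged outside trigger regions, blended and renormalized inside."""
    if region_id(emb, proj) not in trigger_ids:
        return emb
    mixed = (1.0 - strength) * emb + strength * target
    return mixed / np.linalg.norm(mixed)

# Illustrative usage with made-up sizes.
proj, trigger_ids, target = make_secret(dim=64, low_dim=4, n_trigger=4, seed=0)
emb = np.random.default_rng(1).standard_normal(64)
emb /= np.linalg.norm(emb)
served = watermark(emb, proj, trigger_ids, target)
```

In this sketch the trigger depends on where an embedding falls in the projected space rather than on specific tokens, so a paraphrase whose embedding stays close to the original will usually land in the same region. Without the secret matrix and the selected region indices, an attacker cannot tell which embeddings carry the watermark.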