Generative linguistic steganography mainly uses language models and applies steganographic sampling (stegosampling) to generate high-security steganographic text (stegotext). However, previous methods generally introduce statistical differences between the conditional probability distributions of stegotext and natural text, which creates security risks. In this paper, to further ensure security, we present ADG, a novel, provably secure generative linguistic steganographic method that recursively embeds secret information by Adaptive Dynamic Grouping of tokens according to the probabilities assigned by an off-the-shelf language model. We not only prove the security of ADG mathematically, but also conduct extensive experiments on three public corpora to further verify its imperceptibility. The experimental results show that the proposed method can generate stegotext with nearly perfect security.
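To make the grouping idea concrete, the following is a minimal sketch of group-based embedding and extraction, not the authors' implementation. Several pieces are assumptions made for illustration: `toy_model` is a hypothetical stand-in for a language model, the number of groups per step is chosen as 2^r with r = floor(-log2(p_max)), and the groups are balanced with a simple greedy heuristic. If every group carries roughly equal probability mass and the secret bits are uniformly random, picking a group by the bits and then sampling inside it approximately reproduces the model's own distribution.

```python
import math
import random


def toy_model(prefix):
    """Hypothetical stand-in for a language model: next-token probabilities."""
    return [0.25, 0.2, 0.15, 0.15, 0.1, 0.1, 0.05]


def group_tokens(probs, num_groups):
    """Greedily split token ids into groups of near-equal probability mass.

    Assumption: a largest-first greedy heuristic, used here only as an example.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    groups = [[] for _ in range(num_groups)]
    mass = [0.0] * num_groups
    for i in order:
        g = min(range(num_groups), key=lambda k: mass[k])  # lightest group so far
        groups[g].append(i)
        mass[g] += probs[i]
    return groups


def bits_per_step(probs):
    """Assumed capacity rule: r bits when the top probability is at most 2**-r."""
    return int(math.floor(-math.log2(max(probs))))


def embed(bits, model, rng):
    """Turn a bit string into token ids, consuming r bits per generated token."""
    tokens, pos = [], 0
    while pos < len(bits):
        probs = model(tokens)
        r = bits_per_step(probs)
        if r == 0:
            # No capacity at this step: sample from the full distribution.
            tokens.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
            continue
        chunk = bits[pos:pos + r].ljust(r, "0")  # zero-pad the final chunk
        group = group_tokens(probs, 2 ** r)[int(chunk, 2)]
        weights = [probs[i] for i in group]
        tokens.append(rng.choices(group, weights=weights, k=1)[0])
        pos += r
    return tokens


def extract(tokens, model, num_bits):
    """Recover the bits by recomputing the same groups with the shared model."""
    bits = ""
    for t, token in enumerate(tokens):
        probs = model(tokens[:t])
        r = bits_per_step(probs)
        if r == 0:
            continue
        groups = group_tokens(probs, 2 ** r)
        index = next(k for k, g in enumerate(groups) if token in g)
        bits += format(index, "0{}b".format(r))
    return bits[:num_bits]


if __name__ == "__main__":
    secret = "1001110"
    stego = embed(secret, toy_model, random.Random(0))
    print(stego, extract(stego, toy_model, len(secret)) == secret)
```

In this sketch the receiver needs the same model and the secret length. The sampled tokens depend on the random seed, but extraction depends only on which group each token falls into.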