We introduce a new type of format-transforming encryption in which the format of ciphertexts is implicitly encoded within a machine-learned generative model. Around this primitive, we build a system for covert messaging over large, public internet platforms (e.g., Twitter). Loosely, our system composes an authenticated encryption scheme with a method for encoding random ciphertext bits into samples from the generative model's family of seed-indexed token distributions. By fixing a deployment scenario, we are forced to consider system-level and algorithmic solutions to real challenges (such as receiver-side parsing ambiguities and the low information-carrying capacity of actual token distributions) that were elided in prior work. We use GPT-2 as our generative model, so our system cryptographically transforms plaintext bitstrings into natural-language covertexts suitable for posting to public platforms. We consider adversaries with a full view of the internet platform's content, whose goal is to surface posts that use our system for covert messaging. We carry out a suite of experiments to provide heuristic evidence of security and to explore tradeoffs between operational efficiency and detectability.
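To make the encoding step concrete, the following is a minimal toy sketch of mapping ciphertext bits to tokens and back. It is not the method of this work: it assumes a fixed-width selection rule (two bits pick one of four candidate tokens), and it replaces GPT-2 with a hypothetical hash-ranked candidate list standing in for a seed-indexed token distribution. The names `VOCAB`, `candidates`, `encode`, and `decode` are illustrative assumptions, and the input bits are a placeholder for the output of an authenticated encryption scheme.

```python
import hashlib

# Hypothetical toy vocabulary; a real system would use a language model's tokens.
VOCAB = ["the", "a", "cat", "dog", "runs", "sleeps", "fast", "slow",
         "red", "blue", "and", "or", "today", "again", "here", "there"]
BITS_PER_TOKEN = 2  # assumption: every token carries exactly 2 bits


def candidates(seed, context):
    """Stand-in for a seed-indexed next-token distribution.

    Ranks the vocabulary by a hash of (seed, context, token) and returns the
    first 2**BITS_PER_TOKEN tokens. Sender and receiver obtain the same list
    because both know the seed and the tokens emitted so far.
    """
    prefix = seed + b"|" + " ".join(context).encode() + b"|"
    ranked = sorted(VOCAB, key=lambda tok: hashlib.sha256(prefix + tok.encode()).digest())
    return ranked[: 2 ** BITS_PER_TOKEN]


def encode(bits, seed):
    """Map a bit string (e.g. ciphertext bits) to a list of tokens."""
    if len(bits) % BITS_PER_TOKEN != 0:
        raise ValueError("bit string length must be a multiple of BITS_PER_TOKEN")
    tokens = []
    for i in range(0, len(bits), BITS_PER_TOKEN):
        index = int(bits[i:i + BITS_PER_TOKEN], 2)
        tokens.append(candidates(seed, tokens)[index])
    return tokens


def decode(tokens, seed):
    """Recover the bit string by recomputing each candidate list."""
    bits = []
    for i, tok in enumerate(tokens):
        index = candidates(seed, tokens[:i]).index(tok)
        bits.append(format(index, "0{}b".format(BITS_PER_TOKEN)))
    return "".join(bits)


if __name__ == "__main__":
    shared_seed = b"demo-seed"      # placeholder for a shared secret seed
    ciphertext_bits = "10100101"    # placeholder for authenticated-encryption output
    covertext = encode(ciphertext_bits, shared_seed)
    print(" ".join(covertext))
    print(decode(covertext, shared_seed) == ciphertext_bits)
```

The toy avoids the two challenges named above only by construction: tokens are whole words joined by spaces, so the receiver can parse them unambiguously, and every step is forced to carry exactly two bits, which a real token distribution does not guarantee.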