Stylometric approaches have been shown to be quite effective for real-world authorship attribution. To mitigate the privacy threat posed by authorship attribution, researchers have proposed automated authorship obfuscation approaches that aim to conceal the stylometric artefacts that give away the identity of an anonymous document's author. Recent work has focused on authorship obfuscation approaches that rely on black-box access to an attribution classifier to evade attribution while preserving semantics. However, to be useful under a realistic threat model, it is important that these obfuscation approaches work well even when the adversary's attribution classifier is different from the one used internally by the obfuscator. Unfortunately, existing authorship obfuscation approaches do not transfer well to unseen attribution classifiers. In this paper, we propose an ensemble-based approach for transferable authorship obfuscation. Our experiments show that if an obfuscator can evade an ensemble attribution classifier, which is based on multiple base attribution classifiers, it is more likely to transfer to different attribution classifiers. Our analysis shows that ensemble-based authorship obfuscation achieves better transferability because it combines the knowledge from each of the base attribution classifiers by essentially averaging their decision boundaries.
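As a rough illustration of the ensemble idea, the sketch below averages the per-author probabilities of several base attribution classifiers and checks whether a candidate rewrite evades the averaged prediction. Everything in it is an assumption made for illustration: the toy keyword classifiers `clf_a` and `clf_b`, the author names, the example sentences, and the use of simple probability averaging (soft voting) as a stand-in for "averaging decision boundaries". It is not the obfuscator or the classifiers used in the paper.

```python
from typing import Callable, Dict, Sequence

# Hypothetical interface: a base attribution classifier maps a document
# to a probability distribution over candidate authors.
Classifier = Callable[[str], Dict[str, float]]


def clf_a(doc: str) -> Dict[str, float]:
    """Toy classifier (assumption): keys on the word 'whilst'."""
    if "whilst" in doc:
        return {"alice": 0.9, "bob": 0.1}
    return {"alice": 0.4, "bob": 0.6}


def clf_b(doc: str) -> Dict[str, float]:
    """Toy classifier (assumption): keys on semicolon usage."""
    if ";" in doc:
        return {"alice": 0.8, "bob": 0.2}
    return {"alice": 0.3, "bob": 0.7}


def ensemble_predict(classifiers: Sequence[Classifier], doc: str) -> Dict[str, float]:
    """Soft voting: average each author's probability over all base classifiers."""
    totals: Dict[str, float] = {}
    for clf in classifiers:
        for author, prob in clf(doc).items():
            totals[author] = totals.get(author, 0.0) + prob
    return {author: total / len(classifiers) for author, total in totals.items()}


def evades(classifiers: Sequence[Classifier], doc: str, true_author: str) -> bool:
    """True if the averaged prediction no longer points at the true author."""
    probs = ensemble_predict(classifiers, doc)
    return max(probs, key=probs.get) != true_author


original = "I waited whilst it rained; then I left."
candidate_1 = "I waited while it rained; then I left."   # removes one cue only
candidate_2 = "I waited while it rained. Then I left."   # removes both cues

# candidate_1 fools clf_a on its own but not the ensemble;
# candidate_2 fools the ensemble of both classifiers.
fools_single = evades([clf_a], candidate_1, "alice")
fools_ensemble_1 = evades([clf_a, clf_b], candidate_1, "alice")
fools_ensemble_2 = evades([clf_a, clf_b], candidate_2, "alice")
```

In this toy setting, a rewrite tuned against a single classifier can leave intact the cue that another classifier relies on, whereas a rewrite that evades the averaged prediction has to remove the cues used by both. That is the intuition behind the transferability claim, shown here only on hand-made data.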